- As for compactness, the text form can actually be smaller in many situations; the cost is that each line containing 20+ fields must be converted from ASCII to its binary representation through calls to sscanf() or atof().
So 50,000 records times 20 conversions is around 1,000,000 ASCII-to-floating-point conversions. If the fields were stored in binary form, there would be no conversion overhead at all.
While it might be a pretty long blink of an eye, 50,000 lines of text with conversions should not take a great deal of time. Graphing this data on the other hand could take quite a bit longer depending on the charting library used.
--- In email@example.com, Sander Pool <sander_pool@...> wrote:
> Sorry but I have to jump in here before text files get a bad rep :)
> Could be they are to blame in this case but I doubt it.
> Text files are excellent containers for data. There is nothing
> intrinsically different between text and binary files. Both are bytes
> stored on a medium. Interpretation determines whether it's text or binary.
> Reading text files line by line is easily done in any programming language,
> and very quickly too. It is true that you can pack more data into binary
> files but since we're only talking 50K records -per year- this is hardly
> a concern. Files that 'big' open easily in gvim, notepad and other
> editors so reading them line by line should not choke any reasonably
> well written program.
> Now if you want to run queries on your data you could store it as a
> database (say SQLite or whatever) but that's overkill in this case.
> There's only one table and thus no relationships. Even old computers can
> easily read a 10M text file in memory so you can search/process and
> rewrite as needed. New computers will hardly blink reading all those 50K
> records into memory. This even works in interpreted languages like Perl
> and Ruby, let alone in C/C++.
> On 12/29/2010 7:44 PM, Jon Golubiewski wrote:
> > >> Perhaps it is just the size of the data file? Currently the template
> > >> I am using writes to the file only once per 10 mins. This would result
> > >> in 144 records per day, 52,560 per year.
> > I think you probably hit the nail on the head! Text files are poor
> > containers for reading and writing large amounts of data - database
> > files are optimized for that sort of task. Push text files too far
> > in size and they will eventually choke the programs reading them.
> > I've done tech support for many years and have seen limitations even
> > in reading database files! :-p
> > But, hey - sometimes you gotta work with what's available to you due
> > to constraints in the architecture (OS, your program, etc.). It
> > sounds like daily files, or aggregated/processed data might be the way
> > for longer-term graphs. But that's probably not in your control. I'm
> > just talking out loud and probably not adding anything more to this
> > conversation. Just my $0.03
> > The graphing looks pretty cool! I'll have to check this out some
> > more! :-D
> > ~ Jon G.