Software for reading, writing, and updating PDF files
They were designed to be fast on very old computers.
If you wanted to write a from-scratch binary importer, you’d have to support things like the Windows Metafile Format (for drawing things) and OLE Compound Storage.
And these “specs” look more like C data structures than what we traditionally think of as a spec. If you started reading these documents with the hope of spending a weekend writing some spiffy code that imports Word documents into your blog system, or creates Excel-formatted spreadsheets with your personal finance data, the complexity and length of the spec probably cured you of that desire pretty darn quickly.
A normal programmer would conclude that Office’s binary file formats: You’d be wrong on all four counts.
With a little bit of digging, I’ll show you how those file formats got so unbelievably complicated, why it doesn’t reflect bad programming on Microsoft’s part, and what you can do to work around it.
The first thing to understand is that the binary file formats were designed with very different design goals than, say, HTML.
If you’re running on Windows, there’s library support for these that makes it trivial... using these features was a shortcut for the Microsoft team.
But if you’re writing everything on your own from scratch, you have to do all that work yourself.
Last week, Microsoft published the binary file formats for Office.
These formats appear to be almost completely insane.
The Excel 97-2003 file format is a 349-page PDF file. This document includes the following interesting comment: You see, Excel 97-2003 files are OLE compound documents, which are, essentially, file systems inside a single file.
These are sufficiently complicated that you have to read another 9-page spec to figure that out.
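To make the "file system inside a single file" idea a bit more concrete, here is a minimal Python sketch that reads only the fixed 512-byte header of an OLE compound file. The field offsets follow my reading of the published compound file spec, and the names `parse_ole_header` and `make_sample_header` are made up for this illustration. It is nowhere near a real importer: it does not walk the FAT, the directory, or any stream, which is where the actual work lives.

```python
import struct

# Magic bytes at the start of an OLE compound file (per the compound file spec).
OLE_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def parse_ole_header(data: bytes) -> dict:
    """Decode a few fixed-offset fields from a compound file header.

    Hypothetical helper for illustration; offsets are an assumption based on
    the published header layout, not taken from the article.
    """
    if len(data) < 512:
        raise ValueError("a compound file header is 512 bytes long")
    if data[:8] != OLE_SIGNATURE:
        raise ValueError("not an OLE compound file")

    # Offsets 24-33: minor version, major version, byte-order mark,
    # sector shift, mini-sector shift (little-endian 16-bit values).
    minor, major, byte_order, sector_shift, mini_shift = struct.unpack_from(
        "<5H", data, 24
    )
    # Offsets 44-51: number of FAT sectors, then the first directory sector.
    fat_sectors, first_dir_sector = struct.unpack_from("<2I", data, 44)

    return {
        "minor_version": minor,
        "major_version": major,
        "byte_order": byte_order,
        # Sizes are stored as powers of two, like a disk file system would.
        "sector_size": 1 << sector_shift,
        "mini_sector_size": 1 << mini_shift,
        "fat_sectors": fat_sectors,
        "first_dir_sector": first_dir_sector,
    }


def make_sample_header() -> bytes:
    """Build a synthetic version-3 header so the sketch runs without a real file."""
    header = bytearray(512)
    header[:8] = OLE_SIGNATURE
    struct.pack_into("<5H", header, 24, 0x003E, 3, 0xFFFE, 9, 6)
    struct.pack_into("<2I", header, 44, 1, 0)
    return bytes(header)


info = parse_ole_header(make_sample_header())
```

With a real `.xls` file you would pass in its first 512 bytes instead of the synthetic header. Even this tiny slice shows the flavor: sectors, an allocation table, and a directory, all before you get to a single spreadsheet cell.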