
PDF index file converted into a SQL database




Message-ID: <hinr3k$2lvd$1@saria.nerim.net>
Subject: PDF index file converted into a SQL database?
Date: Thu, 14 Jan 2010 20:25:27 +0100


Hello,
When Acrobat's search engine is run on PDF files, it generates an
IDX index file. Is it possible to convert it into a SQL database?
Thanks a lot for any information about this.

Daniel
Paris






Message-ID: <v7ydnZFa-cEu79LWnZ2dnUVZ_uGdnZ2d@posted.localnet>
Subject: Re: PDF index file converted into a SQL database?
Date: Thu, 14 Jan 2010 20:38:27 +0100


At Thu, 14 Jan 2010 20:25:27 +0100 "Daniel" <daniel.frydman_no_spam@metacrawler.com> wrote:

> 
> Hello,
> When the Acrobat's search engine is operated on PDF files it generates an 
> IDX index file. Is it possible to convert it into a SQL database?
> Thanks a lot for any information about this.

What *exactly* is an 'IDX index file'? If this is some sort of text
file, it should be possible to convert this (with tools like Perl or
awk) to a set of SQL statements, which in turn can be fed to a SQL
client program.
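
As a rough illustration of that idea, here is a minimal Python sketch (Python rather than Perl or awk, purely as a choice for this example). It assumes a hypothetical tab-separated text layout of term, document, and page; nothing in this thread confirms that an Acrobat IDX file looks like this. The table name pdf_index and the helper names are likewise made up for illustration.

```python
def sql_quote(value: str) -> str:
    """Quote a string as an SQL literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def lines_to_sql(lines, table="pdf_index"):
    """Turn 'term<TAB>document<TAB>page' lines into SQL statements.

    The three-column layout and the table name are hypothetical.
    """
    statements = [
        f"CREATE TABLE {table} (term TEXT, document TEXT, page INTEGER);"
    ]
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        term, document, page = line.split("\t")
        statements.append(
            f"INSERT INTO {table} (term, document, page) "
            f"VALUES ({sql_quote(term)}, {sql_quote(document)}, {int(page)});"
        )
    return statements


if __name__ == "__main__":
    # Made-up sample lines, only to show the shape of the output.
    sample = ["railroad\tmanual.pdf\t12\n", "driver's\tguide.pdf\t3\n"]
    print("\n".join(lines_to_sql(sample)))
```

The printed statements can be saved to a file and fed to a SQL client program. Doubling single quotes covers the common quoting case only; for untrusted input, parameterized inserts through a database driver are the safer route.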


-- 
Robert Heller             -- 978-544-6933
Deepwoods Software        -- Download the Model Railroad System
http://www.deepsoft.com/  -- Binaries for Linux and MS-Windows
heller@deepsoft.com       -- http://www.deepsoft.com/ModelRailroadSystem/
                                             




Message-ID: <hipm2u$8c1$1@saria.nerim.net>
Subject: Re: PDF index file converted into a SQL database?
Date: Fri, 15 Jan 2010 13:11:58 +0100


The IDX file is the output format of Acrobat's index function. Opened as
a text file, its content is full of Acrobat-specific encoded data.
That's the problem: how to convert a proprietary file into a standard
data file suitable for a SQL database?

Daniel

"Robert Heller" <heller@deepsoft.com> wrote in message news:
v7ydnZFa-cEu79LWnZ2dnUVZ_uGdnZ2d@posted.localnet...
> At Thu, 14 Jan 2010 20:25:27 +0100 "Daniel" 
> <daniel.frydman_no_spam@metacrawler.com> wrote:
>
>>
>> Hello,
>> When the Acrobat's search engine is operated on PDF files it generates an
>> IDX index file. Is it possible to convert it into a SQL database?
>> Thanks a lot for any information about this.
>
> What *exactly* is an 'IDX index file'? If this is some sort of text
> file, it should be possible to convert this (with tools like Perl or
> awk) to a set of SQL statements, which in turn can be fed to a SQL
> client program.
>
>>
>> Daniel
>> Paris
>>
>>
>>
>
> -- 
> Robert Heller             -- 978-544-6933
> Deepwoods Software        -- Download the Model Railroad System
> http://www.deepsoft.com/  -- Binaries for Linux and MS-Windows
> heller@deepsoft.com       -- http://www.deepsoft.com/ModelRailroadSystem/
>
> 






Message-ID: <PuWdnS4oiL2T783WnZ2dnUVZ_hmdnZ2d@posted.localnet>
Subject: Re: PDF index file converted into a SQL database?
Date: Fri, 15 Jan 2010 14:47:58 +0100


At Fri, 15 Jan 2010 13:11:58 +0100 "Daniel" <daniel.frydman_no_spam@metacrawler.com> wrote:

> 
> The IDX file is the output format of the index function of Acrobat. Open as 
> a text file, the content of the file is full of specific data encoded by 
> Acrobat. That's the problem: how to convert a specific file to a standard 
> data file appropriate to a SQL database?

You'll need to figure out what Adobe is doing and then write a program
(e.g. a Perl script or something) that converts it to SQL.


-- 
Robert Heller             -- 978-544-6933
Deepwoods Software        -- Download the Model Railroad System
http://www.deepsoft.com/  -- Binaries for Linux and MS-Windows
heller@deepsoft.com       -- http://www.deepsoft.com/ModelRailroadSystem/
                                                 




Message-ID: <7rgm5mFqbtU2@mid.individual.net>
Subject: Re: PDF index file converted into a SQL database?
Date: Sun, 17 Jan 2010 15:46:14 +0100


Daniel wrote:
> The IDX file is the output format of the index function of Acrobat. Open as 
> a text file, the content of the file is full of specific data encoded by 
> Acrobat. 

Is it a text file or some proprietary binary format?

If it's text, can you post a short chunk of it so we can see?
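
One quick way to answer that question is to look at the first bytes of the file. The following is a minimal Python sketch of a common heuristic, not anything specific to Acrobat: the 30% threshold, the function names, and the sample bytes are all assumptions made for this example.

```python
def looks_binary(data: bytes, threshold: float = 0.30) -> bool:
    """Heuristic: binary if there is a NUL byte or many non-text bytes.

    Bytes outside printable ASCII, tab, LF, and CR count as non-text, so
    text in other encodings raises the ratio; the threshold is arbitrary.
    """
    if not data:
        return False
    if b"\x00" in data:
        return True
    text_bytes = set(range(32, 127)) | {9, 10, 13}
    non_text = sum(1 for b in data if b not in text_bytes)
    return non_text / len(data) > threshold


def hex_preview(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex, and printable-ASCII columns."""
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append(f"{offset:08x}  {hex_part:<{width * 3}} {ascii_part}")
    return "\n".join(rows)


if __name__ == "__main__":
    # Made-up bytes standing in for the start of a real file.
    sample = b"header\x00\x01\xff some words"
    print("binary" if looks_binary(sample) else "text")
    print(hex_preview(sample))
```

To try it on a real file, read the first few hundred bytes with open(path, "rb") and pass them to both functions. The hex preview is also a safe way to post a short chunk of a binary file to a newsgroup.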

> That's the problem: how to convert a specific file to a standard 
> data file appropriate to a SQL database?

That's not the problem: any decent programming language can do this.

The problem is knowing what the content of the .idx file is and what it
means. Unless you have access to Adobe's specification of this (maybe
it's part of the PDF Spec; I don't know), then all the conversions in
the world won't help...
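
To show how small the database side is once the content is understood, here is a minimal Python sketch using the standard sqlite3 module. It assumes the index has already been decoded into (term, document, page) tuples, which is exactly the part that needs the format specification. The table name pdf_index, the column layout, and the helper names are hypothetical.

```python
import sqlite3


def load_entries(conn, entries):
    """Insert (term, document, page) tuples into a table named pdf_index."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pdf_index "
        "(term TEXT NOT NULL, document TEXT NOT NULL, page INTEGER)"
    )
    # Parameterized inserts let the driver handle quoting.
    conn.executemany(
        "INSERT INTO pdf_index (term, document, page) VALUES (?, ?, ?)",
        entries,
    )
    conn.commit()


def find_term(conn, term):
    """Return (document, page) rows for a term, sorted by document and page."""
    cur = conn.execute(
        "SELECT document, page FROM pdf_index "
        "WHERE term = ? ORDER BY document, page",
        (term,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    # In-memory database and made-up entries, for demonstration only.
    conn = sqlite3.connect(":memory:")
    load_entries(conn, [("railroad", "manual.pdf", 12), ("railroad", "guide.pdf", 3)])
    print(find_term(conn, "railroad"))
```

SQLite is used here only because it ships with Python; the same two statements would work against most SQL databases with a different driver.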

///Peter




Message-ID: <hj7hql$1u2l$1@saria.nerim.net>
Subject: Re: PDF index file converted into a SQL database?
Date: Wed, 20 Jan 2010 19:25:24 +0100


> Is it a text file or some proprietary binary format?
It's a proprietary format, I'm afraid. When I open it in a text editor,
there are a lot of unrecognized characters (squares).

That's why I'm looking for software whose publisher had access to Adobe's
specification.

Daniel

"Peter Flynn" <peter.nosp@m.silmaril.ie> wrote in message news:
7rgm5mFqbtU2@mid.individual.net...
> Daniel wrote:
>> The IDX file is the output format of the index function of Acrobat. Open 
>> as a text file, the content of the file is full of specific data encoded 
>> by Acrobat.
>
> Is it a text file or some proprietary binary format?
>
> If it's text, can you post a short chunk of it so we can see?
>
>> That's the problem: how to convert a specific file to a standard data 
>> file appropriate to a SQL database?
>
> That's not the problem: any decent programming language can do this.
>
> The problem is knowing what the content of the .idx file is and what it
> means. Unless you have access to Adobe's specification of this (maybe
> it's part of the PDF Spec; I don't know), then all the conversions in
> the world won't help...
>
> ///Peter
> 






Message-ID: <14lfojsjoe19k.nsffchtqawxq.dlg@40tude.net>
Subject: Re: PDF index file converted into a SQL database?
Date: Wed, 20 Jan 2010 19:40:33 +0100


Daniel wrote:

>> Is it a text file or some proprietary binary format?
> It's a proprietary format, I'm afraid. When I open it into in text file 
> there are a lot of not recognized characters (squares).
> 
> That's why I'm seeking a soft whose editor had access to Adobe's 
> specification.

What about this one: "PDF Manager":
http://www.aks-labs.com/solutions/pdf-manager/index-pdf-file.htm

Robert




 
