Delphi HTML parser

Closed. Posted May 25, 2010. Paid on delivery.

The goal of this project is to build an "intelligent" HTML parser that extracts data from HTML pages.

This parser should be able to automatically extract data such as:

companyName, address, email, fax, tel, website

The parser must be able to extract this set of fields N times per page, since the HTML pages contain tabular data (N records per page).

[url removed, login to view]();

while ([url removed, login to view]()) do begin;

data:=[url removed, login to view]();

// data should be an object or type like

// [url removed, login to view], [url removed, login to view], [url removed, login to view], [url removed, login to view], [url removed, login to view], [url removed, login to view]

end;
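
To make the intended interface a little more concrete, here is a minimal sketch of a record type and an iterator-style loop. The names `TContactRecord`, `TContactParser`, `Add`, `Reset`, `HasNext` and `Next` are hypothetical and are not the redacted identifiers above; the record is filled by hand instead of from HTML.

```pascal
program ContactRecordSketch;
{$APPTYPE CONSOLE}

type
  // Hypothetical record holding the six fields named in the brief.
  TContactRecord = record
    CompanyName, Address, Email, Fax, Tel, Website: string;
  end;

  // Hypothetical iterator-style container; a real parser would fill it from HTML.
  TContactParser = class
  private
    FRecords: array of TContactRecord;
    FIndex: Integer;
  public
    procedure Add(const ARecord: TContactRecord);
    procedure Reset;
    function HasNext: Boolean;
    function Next: TContactRecord;
  end;

procedure TContactParser.Add(const ARecord: TContactRecord);
begin
  SetLength(FRecords, Length(FRecords) + 1);
  FRecords[High(FRecords)] := ARecord;
end;

procedure TContactParser.Reset;
begin
  FIndex := 0;
end;

function TContactParser.HasNext: Boolean;
begin
  Result := FIndex < Length(FRecords);
end;

function TContactParser.Next: TContactRecord;
begin
  Result := FRecords[FIndex];
  Inc(FIndex);
end;

var
  Parser: TContactParser;
  Data: TContactRecord;
begin
  Parser := TContactParser.Create;
  try
    // Stand-in for parsing: one record added by hand.
    Data.CompanyName := 'Example Ltd';
    Data.Email := 'info@example.com';
    Parser.Add(Data);

    Parser.Reset;
    while Parser.HasNext do
    begin
      Data := Parser.Next;
      WriteLn(Data.CompanyName, ' <', Data.Email, '>');
    end;
  finally
    Parser.Free;
  end;
end.
```

A plain record with string fields and a dynamic array avoids generics and other features that Delphi 6 does not have.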

I think a good knowledge of the DOM and of regex is necessary.
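
As an illustration of the pattern-matching side, here is a minimal sketch that pulls the first email-like token out of an HTML fragment. It is written as a hand-rolled character scan because, as far as I know, Delphi 6 ships without a regex unit; a third-party regex library could replace it. The function name `ExtractEmail` and the allowed character set are assumptions made for the example.

```pascal
program EmailScanSketch;
{$APPTYPE CONSOLE}

// Returns the first email-like token in Block, or '' if none is found.
// Deliberately loose: it only checks for allowed characters around an '@'.
function ExtractEmail(const Block: string): string;
const
  EmailChars = ['a'..'z', 'A'..'Z', '0'..'9', '.', '_', '-', '+'];
var
  AtPos, L, R: Integer;
begin
  Result := '';
  AtPos := Pos('@', Block);
  if AtPos = 0 then
    Exit;
  // Walk left from the '@' over characters allowed in the local part.
  L := AtPos;
  while (L > 1) and (Block[L - 1] in EmailChars) do
    Dec(L);
  // Walk right from the '@' over characters allowed in the domain.
  R := AtPos;
  while (R < Length(Block)) and (Block[R + 1] in EmailChars) do
    Inc(R);
  // Require at least one character on each side of the '@'.
  if (L < AtPos) and (R > AtPos) then
    Result := Copy(Block, L, R - L + 1);
end;

begin
  WriteLn(ExtractEmail('<td><a href="mailto:info@acme.example">Contact</a></td>'));
end.
```

Similar scans (or regex patterns) would be needed for phone, fax and website fields, each with its own heuristics.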

Of course it will not work on ALL websites, but it should be universal enough.

It should work with data from:

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

etc.

I think a good strategy would be:

1) find a repetitive fragment in the DOM (when a page contains 20 results, it should extract 20 HTML blocks)

2) apply a parser to each block that contains data to be extracted
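
Step 1 above could be sketched roughly as follows. This is a simplified stand-in that scans the raw string instead of a real DOM: it counts a few candidate opening tags, picks the most frequent one, and cuts the page into one block per occurrence. All names (`PosFrom`, `FindOpenTag`, `CountTag`, `MostRepeatedTag`, `SplitBlocks`) and the candidate tag list are assumptions made for the example.

```pascal
program RepeatedBlockSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Position of SubStr in S at or after Start (1-based), or 0 if absent.
// Hand-written because StrUtils.PosEx only arrived after Delphi 6.
function PosFrom(const SubStr, S: string; Start: Integer): Integer;
begin
  Result := 0;
  if (Start < 1) or (Start > Length(S)) then Exit;
  Result := Pos(SubStr, Copy(S, Start, MaxInt));
  if Result > 0 then Inc(Result, Start - 1);
end;

// Next opening tag named TagName, i.e. '<name' followed by ' ' or '>'.
function FindOpenTag(const Html, TagName: string; Start: Integer): Integer;
var
  AfterName: Integer;
begin
  Result := PosFrom('<' + TagName, Html, Start);
  while Result > 0 do
  begin
    AfterName := Result + Length(TagName) + 1;
    if (AfterName <= Length(Html)) and (Html[AfterName] in [' ', '>']) then
      Exit;
    Result := PosFrom('<' + TagName, Html, Result + 1);
  end;
end;

function CountTag(const Html, TagName: string): Integer;
var
  P: Integer;
begin
  Result := 0;
  P := FindOpenTag(Html, TagName, 1);
  while P > 0 do
  begin
    Inc(Result);
    P := FindOpenTag(Html, TagName, P + 1);
  end;
end;

// Picks the candidate tag that occurs most often (assumed to wrap one result).
function MostRepeatedTag(const Html: string): string;
const
  Candidates: array[0..2] of string = ('tr', 'li', 'div');
var
  I, N, Best: Integer;
begin
  Result := '';
  Best := 0;
  for I := Low(Candidates) to High(Candidates) do
  begin
    N := CountTag(Html, Candidates[I]);
    if N > Best then
    begin
      Best := N;
      Result := Candidates[I];
    end;
  end;
end;

// Cuts Html into blocks, each running from one opening tag to the next.
procedure SplitBlocks(const Html, TagName: string; Blocks: TStrings);
var
  P, Q: Integer;
begin
  P := FindOpenTag(Html, TagName, 1);
  while P > 0 do
  begin
    Q := FindOpenTag(Html, TagName, P + 1);
    if Q = 0 then
      Blocks.Add(Copy(Html, P, MaxInt))
    else
      Blocks.Add(Copy(Html, P, Q - P));
    P := Q;
  end;
end;

const
  Sample = '<table><tr><td>acme ltd</td></tr><tr><td>bolt inc</td></tr></table>';
var
  Blocks: TStringList;
  I: Integer;
begin
  Blocks := TStringList.Create;
  try
    // The sample is already lower-case; real input needs case-insensitive matching.
    SplitBlocks(Sample, MostRepeatedTag(Sample), Blocks);
    for I := 0 to Blocks.Count - 1 do
      WriteLn(I + 1, ': ', Blocks[I]);
  finally
    Blocks.Free;
  end;
end.
```

The last block keeps whatever trails the final result (here `</table>`), and nested tags of the same name would split a record in two, so a real implementation would more likely compare DOM subtrees. Step 2 would then run the field extractors over each entry in `Blocks`.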

The code should be Delphi 6 compatible.

Skills: Delphi, Engineering, Microsoft, Project Management, Software Architecture, Software Testing, Windows Desktop

Project ID: #3451768

About the project

11 proposals. Remote project.

11 freelancers are bidding an average of $425 for this job

IWSolutions

See private message.

$425 USD in 14 days
(101 reviews)
6.7

kraneware

See private message.

$425 USD in 14 days
(8 reviews)
5.9

PaulFarr

See private message.

$425 USD in 14 days
(33 reviews)
4.9

vw7437936vw

See private message.

$425 USD in 14 days
(19 reviews)
4.1

powzak

See private message.

$425 USD in 14 days
(28 reviews)
4.1

devdlrb

See private message.

$425 USD in 14 days
(1 review)
0.7

myimservices

See private message.

$425 USD in 14 days
(0 reviews)
0.0

heidelguest

See private message.

$425 USD in 14 days
(1 review)
0.0

secureenix

See private message.

$425 USD in 14 days
(0 reviews)
0.0

abeloqp

See private message.

$425 USD in 14 days
(0 reviews)
0.0

bluesoftcoders

See private message.

$425 USD in 14 days
(3 reviews)
2.2