
Create Google Site Search Crawler for Jonathan Siennicki

$250-750 USD

Closed
Posted about 6 years ago

Paid on delivery
The Jonathan Siennicki Project: Create software in C or C++ (or a script in a language such as Python or PHP) that is multi-threaded, supports HTTP and HTTPS proxies, and can scrape Google, Yahoo, and Bing results to identify sites using Google Site Search (and competing services such as [login to view URL], Algolia, etc.) by the JavaScript and/or various footprints they use. (This is similar to scraping Google results to see which sites have Google Analytics.) Crawled results will filter out duplicates of top-level domains (TLD): if a domain appears both with a www. prefix and without it, we want to keep only one.

Tasks:
1. Identify the contact page (most often [login to view URL] and/or [login to view URL], but this can vary). Scrape the contact name (if it appears), the email addresses, and the telephone numbers. This information should be saved in an Excel file. We would love it if the software could also submit to WordPress sites, etc., and perhaps support a captcha API for that purpose. There is lots of code on GitHub/SourceForge for this.
2. Extract WHOIS contact information and, if possible, the same details from the website's contact page. The only things we want are the website, contact name, email address, and telephone number of the webmaster. The software will have an option to save the output, according to the search engine it was crawled from, in Excel, CSV, or TXT.
3. The software should have a built-in WYSIWYG editor, support multiple SMTP credentials and proxies for sending emails, and be able to do the scraping and run the email task immediately thereafter.

Menu options of the software: Define how many sites to crawl. Define tasks related to emails. There should be a menu to output success/error logs, choose which search engines to crawl and use (all are selected by default), configure additional footprints, and produce output of results (compile database) in Excel or CSV format. The menu should also show how many proxies are working, and the software should use them at random when extracting from search engines.
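
As a rough illustration of the footprint detection and contact scraping described in the tasks above, here is a minimal Python sketch. It is not the client's specification: the footprint strings are placeholders (the real list is in the attached document, which is not reproduced here), and the names `FOOTPRINTS`, `detect_platforms`, and `extract_contacts` are hypothetical. The regular expressions are deliberately simple and will miss or over-match some real-world formats.

```python
import re

# Placeholder footprints for illustration only; the client's attached
# document holds the real list.
FOOTPRINTS = {
    "Google Site Search": ["cse.google.com/cse.js", "google.com/cse"],
    "Algolia": ["algoliasearch"],
}

# Simple patterns: good enough for a sketch, not exhaustive.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def detect_platforms(html, footprints=FOOTPRINTS):
    """Return the sorted names of platforms whose footprint appears in the page."""
    lowered = html.lower()
    return sorted(
        name
        for name, marks in footprints.items()
        if any(mark.lower() in lowered for mark in marks)
    )


def extract_contacts(html):
    """Return the unique email addresses and phone-like strings found in the page."""
    emails = sorted(set(EMAIL_RE.findall(html)))
    phones = sorted(set(match.strip() for match in PHONE_RE.findall(html)))
    return {"emails": emails, "phones": phones}
```

In a full tool these functions would run on pages fetched through the proxy pool; fetching, WHOIS lookups, and contact-name detection are left out of this sketch.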

Output: For example, if we selected "Google Site Search" and used "Google, Bing and Yahoo" to get the results, we should be able to create a database based on that. By default, all search engines are used to find all the results for each 'site search' platform, and duplicate domains are erased before following up with checking for website CONTACT and WHOIS info.

Note: There is tons of open source code for proxies, scraping, WHOIS lookups, etc., and everything written here is available on SourceForge and GitHub. So this is like Lego. If you can create the Jonathan Siennicki software, please let me know the language (it can be web-based, but preferably a binary application). Please give a price, a delivery time, and any software résumé you have to convince us you're the right person for the job. We've prepared an attached document with footprints of the various platforms offering a service similar to Google Site Search.
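
The deduplication and per-engine output described above could be sketched roughly as follows in Python. This is only one possible approach under stated assumptions: the function names (`normalize_domain`, `dedupe_by_domain`, `to_csv`) and the two-column layout are hypothetical, and only a leading `www.` is stripped, so other subdomains are treated as separate sites.

```python
import csv
import io
from urllib.parse import urlsplit


def normalize_domain(url):
    """Return the lowercase host of a URL without a leading 'www.'."""
    # Prefix scheme-less input with '//' so urlsplit parses it as a host.
    host = urlsplit(url if "://" in url else "//" + url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def dedupe_by_domain(results):
    """Keep the first (engine, domain) pair seen for each domain.

    results is an iterable of (engine, url) tuples.
    """
    seen = set()
    unique = []
    for engine, url in results:
        domain = normalize_domain(url)
        if domain and domain not in seen:
            seen.add(domain)
            unique.append((engine, domain))
    return unique


def to_csv(rows):
    """Serialize (engine, domain) rows to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["search_engine", "domain"])
    writer.writerows(rows)
    return buf.getvalue()
```

Because each row keeps the engine that first returned the domain, the same rows can later be filtered or split by search engine before saving to Excel, CSV, or TXT.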
Project ID: 16532957

About the project

11 proposals
Remote project
Active 6 yrs ago

11 freelancers are bidding on average $481 USD for this job
Hello. I am fully experienced with C++, C, C#, .NET, ASP.NET, and Windows desktop application development. You will be satisfied with my results. I can implement the Google search crawler. Best regards
$555 USD in 10 days
4.7 (45 reviews)
6.2
With 10 years of experience in WordPress, PHP, WooCommerce, and SEO, I can help you finish the projects you propose. We can communicate with each other easily.
$666 USD in 10 days
5.0 (5 reviews)
2.7
• 11+ years of IT experience
• Extensive knowledge of full-stack web development using HTML5, CSS3, LESS, SASS, PostCSS, JavaScript, ES6, jQuery, ReactJS, Redux, Flux, D3, NodeJS, Angular 2.0, GUI designs, Webpack, Gulp, Java, and Python
• Good knowledge of data structures and algorithms in JavaScript
• Strong proficiency in JavaScript, including DOM manipulation and the JavaScript object model
• Thorough understanding of React.js and its core principles
• Good understanding of Google Cloud and AWS
• Good knowledge of Google Cloud data storage, API, BigQuery, Cloud SQL, and App Engine
• Good knowledge of GUI design
• Experience with popular React.js workflows (such as Flux or Redux)
• Experience with common front-end development tools such as Babel, Webpack, and NPM
• Proficient in building web user interfaces (UI) using HTML5, DHTML, table-less XHTML, CSS3, and JavaScript that follow W3C web standards and are browser compatible
• Experienced in building cross-browser compatible applications using HTML5 and CSS3
• Good knowledge of MVC architecture and an understanding of the model, view, and controller concepts
• Expertise in debugging and troubleshooting existing code using developer tools such as Firebug, Chrome, Internet Explorer, and Safari
• Extensively used new ES6 features
• Designed component state diagrams for React based on the functional requirements
• Worked on web accessibility issues
• Involved in designing responsive web design
• Involved in web UI architecture
$555 USD in 15 days
0.0 (0 reviews)
0.0

About the client

Budapest, Israel
5.0
1
Payment method verified
Member since Jul 5, 2006

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)