[GET] Content Scraper

bigcajones

After seeing a WSO on how to scrape content from YouTube videos, I decided to make a template for it instead of purchasing the software.

I've also seen a lot of questions on the forum lately about things like C# code, GAC references and HTTP requests. This template uses all of them, so you can use it as a reference.

The template takes your keyword(s) from a file, goes to YouTube and searches for videos that have Closed Captioning. It scrapes all the video IDs and then goes to videos.Google to scrape the content of the CC text. You won't see any of this happening in a browser window because the scraping is done with HTTP GET requests.
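For anyone who wants to see the ID-scraping step outside of the template, here is a rough Python sketch. It is not the template's actual code: the function name `extract_video_ids` is made up for illustration, and it only assumes that the search page HTML contains ordinary `watch?v=` links with 11-character IDs.

```python
import re

# Assumption: video links in the page look like "watch?v=<11-character ID>".
VIDEO_ID_PATTERN = re.compile(r"watch\?v=([A-Za-z0-9_-]{11})")


def extract_video_ids(page_html: str) -> list[str]:
    """Return the unique video IDs found in page_html, in order of first appearance."""
    found = VIDEO_ID_PATTERN.findall(page_html)
    # dict.fromkeys drops duplicates while keeping the original order.
    return list(dict.fromkeys(found))


if __name__ == "__main__":
    sample = (
        '<a href="/watch?v=dQw4w9WgXcQ">first</a>'
        '<a href="/watch?v=dQw4w9WgXcQ&amp;t=10">same video again</a>'
        '<a href="/watch?v=abcdefghijk">second</a>'
    )
    print(extract_video_ids(sample))
```

The page HTML itself would come from a plain HTTP GET, which is why nothing shows up in a browser window.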

The page text is in XML, so I have included a C# action and added a reference to the XML library to turn the XML into readable text. I also clean up the text a little once the action has finished with it.
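The template does this step in a C# action, but the same idea can be sketched in Python. This is only an illustration under assumptions: it supposes the caption XML is a list of `<text>` elements inside a root element and that the text may contain leftover HTML entities. The function name `captions_to_text` is hypothetical.

```python
import html
import re
import xml.etree.ElementTree as ET


def captions_to_text(xml_payload: str) -> str:
    """Flatten caption XML into one readable string.

    Assumes each caption line lives in a <text> element (an assumption,
    not something confirmed by the original post).
    """
    root = ET.fromstring(xml_payload)
    parts = []
    for node in root.iter("text"):
        raw = "".join(node.itertext())
        # Decode entities that survived XML parsing, e.g. "&#39;" -> "'".
        cleaned = html.unescape(raw)
        # Collapse line breaks and repeated spaces into single spaces.
        cleaned = re.sub(r"\s+", " ", cleaned).strip()
        if cleaned:
            parts.append(cleaned)
    return " ".join(parts)


if __name__ == "__main__":
    sample = (
        '<transcript>'
        '<text start="0" dur="1">Hello\nworld</text>'
        '<text start="1" dur="1">it&amp;#39;s  fine</text>'
        '</transcript>'
    )
    print(captions_to_text(sample))
```

The entity decoding and whitespace collapsing stand in for the "little cleaning up" mentioned above; the real template may clean different things.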

A folder is created named by your keyword so you will know what text goes with what keyword. The files are saved by the video watch ID so that if there is some text that doesn't make sense, you can go to the video it was pulled from and clean up the text if needed.
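The folder layout described above could look something like this in Python. It is a sketch, not the template's code: `save_transcript` is a made-up name, the `.txt` extension is an assumption, and replacing characters that are not allowed in folder names is an extra safety step the original post does not mention.

```python
import re
from pathlib import Path


def save_transcript(base_dir: str, keyword: str, video_id: str, text: str) -> Path:
    """Write text to <base_dir>/<keyword>/<video_id>.txt and return the file path."""
    # Assumption: swap characters that Windows forbids in folder names for "_".
    folder_name = re.sub(r'[\\/:*?"<>|]', "_", keyword).strip()
    folder = Path(base_dir) / folder_name
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / f"{video_id}.txt"
    target.write_text(text, encoding="utf-8")
    return target


if __name__ == "__main__":
    path = save_transcript("output", "dog training", "abcdefghijk", "some caption text")
    print(path)
```

Because the file name is the watch ID, a file that reads badly can be traced straight back to its video.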

Just a few points here:

This content is probably not good for your main site since a lot of the vids are published on other platforms that include the transcript with the video, so you would have duplicate content on your site.

You will need a proxy.txt file and a keyword.txt file in the project directory. You can take the proxy action out, but we all know what happens if you scrape Google too much from one IP.

The template is open source and free for everyone to use, so I don't want to see it sold on here as your own.

The content is not perfect, but it can be fixed. It would be good for spinning and using for GSA, SENuke or Zenno blasts on whatever tiers you might be building.
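On the proxy.txt and keyword.txt point, here is a small Python sketch of how two such files might be read and paired up. It assumes one entry per line in each file and a simple round-robin over the proxies; the template may pick proxies differently, and both function names are hypothetical.

```python
from itertools import cycle
from pathlib import Path


def read_list(path: str) -> list[str]:
    """Read a text file and return its non-empty lines, stripped of whitespace."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def pair_keywords_with_proxies(keywords: list[str], proxies: list[str]) -> list[tuple]:
    """Assign a proxy to each keyword, cycling through the proxies in order.

    With no proxies, every keyword is paired with None (requests go out
    from your own IP).
    """
    if not proxies:
        return [(keyword, None) for keyword in keywords]
    return list(zip(keywords, cycle(proxies)))


if __name__ == "__main__":
    keywords = read_list("keyword.txt")
    proxies = read_list("proxy.txt")
    for keyword, proxy in pair_keywords_with_proxies(keywords, proxies):
        print(keyword, "->", proxy)
```

Rotating through the list spreads the requests across IPs, which is the whole reason for keeping the proxy action in.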

Good luck with it and if you have problems, just let me know.

View attachment YTCC.xmlz
 

Hungry Bulldozer


bigcajones

Thanks HB. Just trying to give back a little.
 

amul

As usual, you rock!
 
