This is a discussion on Ajax Scraping / Real-Time Feed within the AJAX Help forums, part of the Beginners AJAX category.
#1
Junior Member
Join Date: Feb 2010
Posts: 3
Rep Power: 0
Ajax Scraping / Real-Time Feed
Hi,
I want to get real-time prices from a website. I've investigated with Fiddler in a few locations and with different browsers, and the behavior I see (or can't see) in Fiddler differs between them. The browser connects to the page and downloads a .swf file. Then the browser either (a) starts polling on a regular basis, or (b) does something else that isn't observable in Fiddler. There is a port 80 HTTP address and a port 443 socket address in the XML that's initially sent.

I can replicate the regular polling the browser is doing, but I would rather connect to a feed, or replicate exactly the same behavior as the browser, to avoid getting my IP address blocked.

I've been reading about the following concepts to try to get to the bottom of what I could do: HTTP streaming, persistent polling, AJAX, Comet, XMLHttpRequest, etc. What else should I be looking at?

It may be possible to run the .swf files locally, outside the browser, to see if that would work. Otherwise, could I create objects like the IFrame that's persistently polling the website? Would it be better to avoid all the donkey work the browser does in periodically exchanging viewstate and session IDs, and just display a webbrowser object in a webform and get at the data through the DOM (I don't know how to do this yet)? Lastly, would I be better off just writing a proxy to sit between the browser and the web server and capture the XML as it's sent back?

I'd appreciate any help at all on this. Thanks.
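For readers who want something concrete, the "regular polling" option can be sketched roughly as below. This is only an illustration in Python under stated assumptions, not what the site or its .swf actually does: `poll_for_changes` and `make_http_fetch` are hypothetical helper names, and the URL and User-Agent string passed in would have to be copied from whatever Fiddler shows the browser sending. The opener keeps cookies between requests, so a session ID set by the server is sent back on later polls as a browser would do.

```python
import time
import urllib.request
from http.cookiejar import CookieJar
from typing import Callable, Iterator, Optional


def make_http_fetch(url: str, user_agent: str) -> Callable[[], str]:
    """Build a fetch function that reuses cookies across requests."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )
    # Send the same User-Agent header on every request (value is an assumption).
    opener.addheaders = [("User-Agent", user_agent)]

    def fetch() -> str:
        with opener.open(url, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")

    return fetch


def poll_for_changes(
    fetch: Callable[[], str],
    interval_s: float,
    max_polls: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Call fetch() up to max_polls times; yield a payload only when it
    differs from the previous one."""
    last: Optional[str] = None
    for i in range(max_polls):
        payload = fetch()
        if payload != last:
            last = payload
            yield payload
        if i < max_polls - 1:
            sleep(interval_s)  # wait between polls, not after the last one
```

Passing `fetch` in as a parameter means the loop can be tried against canned responses before it is pointed at a real server. Matching the browser's own polling interval is probably the safest choice if the worry is being blocked.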
#2
Junior Member
Join Date: Feb 2010
Posts: 3
Rep Power: 0
First off, with Flash you will need to get past the <policy-file-request> cross-domain security feature, which is built into the Flash plug-in. I just spent 3 weeks trying to make a chatroom in Flash, only to discover after beating my brains out that the only way to make it work is to run it on a server where you have root-level access to run a socket server (so that you can run it as a daemon), and you have to set up your port forwarding and server conf files to accept requests on port 843 so that Flash can natively access the crossdomain.xml policy file. It ain't no fun.
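To make the port 843 exchange concrete: Flash Player opens a TCP connection, sends the string `<policy-file-request/>` followed by a null byte, and expects a cross-domain policy XML document terminated by a null byte in reply. A minimal sketch of such a responder in Python follows. The domain `example.com` and the port `5000` are placeholders, not values from this thread, and `build_policy`, `PolicyHandler` and `serve` are hypothetical names.

```python
import socketserver

# Exact bytes Flash Player sends when asking for a socket policy file.
POLICY_REQUEST = b"<policy-file-request/>\x00"


def build_policy(domain: str, ports: str) -> bytes:
    """Return a null-terminated socket policy allowing `domain` to reach `ports`."""
    xml = (
        '<?xml version="1.0"?>'
        "<cross-domain-policy>"
        f'<allow-access-from domain="{domain}" to-ports="{ports}"/>'
        "</cross-domain-policy>"
    )
    return xml.encode("ascii") + b"\x00"


class PolicyHandler(socketserver.BaseRequestHandler):
    def handle(self) -> None:
        # Read until the terminating null byte (or a small size cap).
        data = b""
        while not data.endswith(b"\x00") and len(data) < 64:
            chunk = self.request.recv(64)
            if not chunk:
                break
            data += chunk
        if data == POLICY_REQUEST:
            # Placeholder domain and port; replace with your own.
            self.request.sendall(build_policy("example.com", "5000"))


def serve(port: int = 843) -> None:
    """Serve policy files forever. Binding a port below 1024 normally
    requires root, which is the hosting restriction described above."""
    with socketserver.TCPServer(("", port), PolicyHandler) as server:
        server.serve_forever()
```

Calling `serve()` blocks, so it would normally run as its own daemon process.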
Basically, what this means is that if you are running on a cheap hosting site (like I do) on GoDaddy or the like, you will not have the administration rights to make it work. A nice workaround would be to do all your scraping/spidering in PHP (cURL) and then convert the result into XML or whatever other format you want to send to Flash. This way you are not asking Flash to cross domains for data (let PHP do that part); the data is now local on your server (does that make sense?).

Good luck.
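The "convert the result into XML" step of that workaround might look something like the sketch below. It is written in Python rather than PHP purely for illustration, and it covers only the conversion, not the scraping. The `<prices>`/`<price>` element names and the `prices_to_xml` function are made up for this example; a real Flash client would need whatever format it was written to read.

```python
import xml.etree.ElementTree as ET
from typing import Mapping


def prices_to_xml(prices: Mapping[str, float]) -> str:
    """Serialize scraped {product: price} pairs as a small XML document.

    The element and attribute names are hypothetical.
    """
    root = ET.Element("prices")
    for product, price in sorted(prices.items()):
        item = ET.SubElement(root, "price", {"product": product})
        item.text = f"{price:.2f}"  # fixed two decimal places
    return ET.tostring(root, encoding="unicode")


print(prices_to_xml({"oil": 78.25, "gold": 1100.5}))
# <prices><price product="gold">1100.50</price><price product="oil">78.25</price></prices>
```

A script on your own server would return this string, so Flash only ever talks to its own domain.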
#3
Junior Member
Join Date: Feb 2010
Posts: 3
Rep Power: 0
Thanks for the reply.
>First off, with Flash you will need to get past the <policy-file-request> cross
>domain security features which is embedded in the flash plug-in.

Why is it that a browser with the Flash object embedded in it can access the real-time feed, but I can't replicate what it's doing in .NET or Java etc.? I can decompile the Flash object, so I should be able to see what calls it's making, but at the moment I don't know what IDE I could open the .as file in, to see whether I could compile it locally and run it in debug mode or something similar.

>A nice work around would be to make all your scraping/spidering scripted with
>PHP (Curl) and then convert that into some XML or other format you want to
>send to flash. This way you are not asking flash to cross domains for data
>(Let PHP do that part) which is now local on your server (does that make
>sense?).

To be honest, I'm probably at a more basic level at the moment than you think. It's probably best if I get a vague conceptual idea of how it could be done and then go and (a) learn PHP, (b) get stuck into JavaScript again, (c) get better at sockets and networking, or (d) something else.

I don't want to embed the Flash object in my own web page or anything. I just want to get access to a time series of the real-time data that's being sent to the .swf object. Typically, the user navigates to the site, chooses a single product, and then the Flash front end gets updated with prices. If the user clicks another product, there are a few HTTP POSTs and then the data for that product starts streaming into the Flash object. What I want to do is get access to all the prices for all the products, so I presume I'll have to set up a connection for each of them.

What you said above about PHP doesn't make sense to me at the moment; I don't understand it. Should I get stuck into PHP for the next few weeks to try to get to the bottom of it? What do I need to download to be able to write and run code in PHP or cURL?

Thanks again for the pointers.
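As a rough illustration of turning captured XML into a time series, here is a small Python sketch. The XML layout it expects (`<prices><price product="...">1.23</price></prices>`) is invented for the example, since the thread never shows the real payload, and `parse_ticks` is a hypothetical helper. Each price is stamped with the time it was received.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from typing import List, Tuple

# One observation: (time received, product name, price).
Tick = Tuple[datetime, str, float]


def parse_ticks(xml_text: str, received_at: datetime) -> List[Tick]:
    """Extract one tick per <price> element in a hypothetical payload."""
    root = ET.fromstring(xml_text)
    ticks: List[Tick] = []
    for el in root.iter("price"):
        if el.text:  # skip empty elements
            ticks.append((received_at, el.get("product", ""), float(el.text)))
    return ticks


# Usage: append the ticks from each captured payload to a running series.
series: List[Tick] = []
payload = '<prices><price product="gold">1100.50</price></prices>'
series.extend(parse_ticks(payload, datetime.now(timezone.utc)))
```

However the XML is captured (polling, a proxy, or something else), the parsing step stays the same, so it can be written and tested against payloads saved from Fiddler.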
#4
Junior Member
Join Date: Feb 2010
Posts: 3
Rep Power: 0
Anyone have any other ideas?
I've been reading around this subject over the last few weeks and have encountered Selenium/Firebug/IE Developer Toolbar and Watir/FireWatir/WatiN, but I think I'm on the wrong track. All they will enable me to do is navigate, click, test etc., but when I finally get to the page with the embedded .swf that creates a persistent HTTP connection to my browser, they're no good.

Should I pursue the proxy idea? Or is there something in cURL?

Ultimately, because there's a choice of about 30 data sources in the .swf object supplying real-time data, to get all the data I need I have to simulate what 30 browsers open side by side would be doing, i.e. a browserless connection that collects the XML from 30 data sources on the same server at the same time.

I'd appreciate any nod in the right direction.
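One way to picture the "30 browsers side by side" requirement without any browser is a thread pool with one worker per data source. The Python sketch below assumes each source can be fetched by some function `fetch_one(source)`, which is left as a parameter because the thread does not say how a single source is requested. `fetch_all` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def fetch_all(
    sources: Iterable[str],
    fetch_one: Callable[[str], str],
    max_workers: int = 30,
) -> Dict[str, str]:
    """Fetch every source concurrently and return {source: payload}."""
    source_list = list(sources)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as its inputs.
        payloads = pool.map(fetch_one, source_list)
        return dict(zip(source_list, payloads))
```

Calling `fetch_all` repeatedly gives one snapshot of all sources per round. Whether 30 simultaneous connections from a single address is acceptable to the server is a separate question that the code cannot answer.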