Website documenting train arrival delays

Amtrak Unlimited Discussion Forum


Faraz

Guest
I could rather easily write a script that stores data from the Amtrak website about train delays. A user could then search for Train 6 into Denver and see its on-time performance over anything from the past 7 days to several months. Is this something useful? Would Amtrak frown upon such a site?
 
The Amtrak site already has this sort of feature. It's just that it goes back only five days or so. What you could do is something similar to what has been taking place over at another site, where one poster compiles the on-time stats for selected routes at either specified middle points or end points. The poster then figures the percentage of arrivals that were on time versus late, and lists the average delay for each selected route. It is then open for discussion as to what the reasons are for delays on different routes.

Edit: The site mentioned in this post is Railroad.net. Once there, search for the "Selected On Time Performance" (or something to that effect) thread. One member of the Amtrak forum there lists the punctuality of a select few East Coast routes, with some statistics regarding ridership, Revenue Passenger Miles, how late a given train was at its end points, etc.

If we were to do something like that here, the "Train Status" function of the Amtrak website could be used to compile data, which could then be listed at the end of each month. I think it would be constructive and enlightening to produce an on-time report for the Western LD routes at an intermediate point: say, St. Cloud, MN for the Empire Builder, Omaha (or Lincoln, NE) for the California Zephyr, and Raton, NM for the Chief - or something to that effect. Then list the percentage of times each train arrived "on time" and the percentage of times it was delayed, with reasons (if known) for the delays. This could be done by any member of the forum here.
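The monthly tally described above comes down to two numbers per train and station: the share of arrivals that were on time and the average delay. Here is a minimal sketch in Python, assuming delays are recorded in minutes late at one chosen station and that "on time" means a delay at or below a threshold you pick. The function name, the threshold and the sample numbers are all made up for illustration; none of this is Amtrak's own definition.

```python
from statistics import mean

def summarize_delays(delays_min, on_time_threshold_min=0):
    """Return (percent on time, average delay in minutes) for one train at one station."""
    if not delays_min:
        raise ValueError("need at least one recorded arrival")
    # Count arrivals whose delay is within the chosen threshold
    on_time = sum(1 for d in delays_min if d <= on_time_threshold_min)
    pct_on_time = 100.0 * on_time / len(delays_min)
    return pct_on_time, mean(delays_min)

# Made-up sample: minutes late on five arrivals
sample = [0, 0, 15, 45, 120]
pct, avg = summarize_delays(sample)
print(f"{pct:.0f}% on time, average delay {avg:.0f} min")
```

Raising `on_time_threshold_min` (to 10 or 15, say) is one way to avoid counting a train that is a minute or two behind as late.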
 
What is the other site where this is being done, if you can say?
 
What type of software would be required to automate extracting data from websites? I have a website that lists Amtrak routes, but it is very difficult to maintain from the printed timetable.
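For what it's worth, no special software is strictly needed for the extraction step itself: most scripting languages can parse HTML out of the box. Below is a minimal sketch using only Python's standard library `html.parser` to pull the cells out of an HTML table. The page fragment, station rows and times are invented for the example and are not the markup of any real site, and whether a given site permits automated retrieval is a separate question to check first.

```python
from html.parser import HTMLParser

class TableCellParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

# Hypothetical page fragment with made-up times
SAMPLE_HTML = """
<table>
  <tr><th>Station</th><th>Scheduled</th></tr>
  <tr><td>Station A</td><td>08:00</td></tr>
  <tr><td>Station B</td><td>12:30</td></tr>
</table>
"""

def extract_rows(html):
    parser = TableCellParser()
    parser.feed(html)
    parser.close()
    return parser.rows

for row in extract_rows(SAMPLE_HTML):
    print(row)
```

Real pages are messier than this, so expect to adjust the parser to the actual markup.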
 
You can get the info from the Amtrak site, but you have to keep resubmitting the form each time, which is a little annoying, and then you forget what the 'average' is. My script would store everything in a database and show you results on one page for as long as there is data available.
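To make the "store everything in a database" idea concrete, here is a rough sketch using Python's built-in `sqlite3` module. It is not the actual script: the table layout, column names, function names and sample readings are all assumptions made for illustration.

```python
import sqlite3

def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS arrivals (
               train        TEXT NOT NULL,
               station      TEXT NOT NULL,
               service_date TEXT NOT NULL,    -- ISO date string
               delay_min    INTEGER NOT NULL, -- minutes late
               PRIMARY KEY (train, station, service_date)
           )"""
    )
    return conn

def record_arrival(conn, train, station, service_date, delay_min):
    # INSERT OR REPLACE keeps one row per train/station/day,
    # so looking up the same day twice does not create duplicates.
    conn.execute(
        "INSERT OR REPLACE INTO arrivals VALUES (?, ?, ?, ?)",
        (train, station, service_date, delay_min),
    )
    conn.commit()

def average_delay(conn, train, station):
    """Return (average delay in minutes or None, number of days stored)."""
    return conn.execute(
        "SELECT AVG(delay_min), COUNT(*) FROM arrivals "
        "WHERE train = ? AND station = ?",
        (train, station),
    ).fetchone()

# Made-up readings for Train 6 into Denver
conn = open_db()
record_arrival(conn, "6", "Denver", "2000-01-01", 10)
record_arrival(conn, "6", "Denver", "2000-01-02", 20)
record_arrival(conn, "6", "Denver", "2000-01-03", 60)
record_arrival(conn, "6", "Denver", "2000-01-03", 60)  # repeated lookup, same day
print(average_delay(conn, "6", "Denver"))
```

Passing a file name instead of `":memory:"` keeps the history between runs, which is what lets the record grow past the five days the source site shows.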
 
I think it would be great. I do something similar for Acela and the Keystone service (my "script" is to manually record the arrivals every couple of days). Amtrak data is public information under the Federal Freedom of Information Act, so they could frown all they want, that data is yours to use as you like. Go for it and please post your results.
 
I like the concept! I'm also curious about the referenced website where OTP stats are currently presented.

Realizing it would be somewhat difficult, I wonder if any circumstantial data, e.g. weather, could be associated with the stats? Just a thought :)
 
I would be very interested in such a site. I wish that I had the coding skills to cook it up myself, but unfortunately I skipped that class.

I'm afraid that you might have some issue with Amtrak if you were to post the results publicly, though. One of the things that you nominally agree to on the Amtrak site is the following:

While using the Site, Materials and/or Software, you agree not to:

Use any robot, spider, site search/retrieval application or other manual or automatic device or process to retrieve, index, “data mine” or in any way reproduce or circumvent the navigational structure or presentation of the Site or its contents;

Now, since Amtrak is a public corporation and is covered by the Freedom of Information Act, and because the on-time status of trains is a matter of legitimate public interest, I think that you would have a pretty good case to make in your favor, but I'm not a lawyer and couldn't tell you one way or the other.

But if you do write the script I'd love to see it!
 
That would be awesome! Do it. If Amtrak has a problem with it, they'll tell you with a C&D, but I can't imagine why they would object.
 
I've done that a couple of times with Excel just for the heck of it on trips that I'm planning. Unfortunately, my data gets lost in the sea of Excel files...

As for posting the data, if you don't use a spider to grab the info, and if you post the results from your own hard manual labor, I don't think they can say anything about you publicly making them known.
 
Faraz,

Because you're posting using the guest account and including a link, the BB software automatically marks your post as hidden until such time as a moderator can approve the post. We unfortunately were forced to implement this practice thanks to the idiot spammers of the world. As you can see, I have approved one of your posts with the link to your database.

But I wanted you to know why you weren't seeing anything even though you kept posting it over and over. :)
 
Great website! Now I can keep up with how many times the Crescent is late getting into my hometown. So far it has been on time once, and 2, 4, 5, 7, 11, 17, 22, 62, 74 and 95 minutes late, on both 19 and 20. I saw the Sunset Limited was over 800 minutes late getting into NOL earlier this week. Anyone know what happened? My guess is the UP stuck them behind every freight train between LAX and NOL. I hope this page will be around for a long time.
 
I have completed the webpage here: [redacted temporarily]

All the searches are stored in a database. So if nobody searches for a route over five days then that route will not get stored in the database. Right now it just goes back five days but that will increase from today onwards.

I'll probably fix the bugs, then reset the system after a week or so.
Faraz

Nice job and thank you!

I hope you, or anyone else for that matter, didn't take my remark about including "circumstantial data" out of context. I fully realize this would be pretty difficult to do! I was just thinking the ability to associate weather or other conditions with delays would be beneficial.

Deimos
 
In order to do that, one would have to research the cities along the route of every single train number, then build another bot to pull the weather for those cities only, in the correct order. To be really useful, since LD trains can take several days to travel a route, you would also have to go back and pull historical weather data for the correct period of time for each of those cities. That would be a horrendous amount of work, it seems to me: sitting down and reading the route of each train in the system from the timetables on the web site (a bunch of PDF files). Even then, you can't know from the delay data whether meteorological conditions actually had anything to do with the delays for that particular train, or if it was a mechanical problem, or if the train was stuck behind a freight that hit a vehicle at a grade crossing, or if there was a medical emergency with a passenger and the train had an extended stop while emergency medical assistance was called, or maybe that train or the freight in front of it had a crew that expired... The possibilities are endless.
 
I know an initial limitation was that someone must visit the site and request a train status for the system to pull down the last 5 days of data.

Google "cron jobs"; that might help you out.
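The cron suggestion would remove the need for a visitor to trigger each lookup: a scheduled job polls a fixed list of trains on its own. A rough sketch of such a job in Python follows. `fetch_delay` and `store` are hypothetical stand-ins for whatever the real script uses to read a train's status and write it to the database, and the watch list, script path and schedule are examples only.

```python
from datetime import date

# Example crontab entry (path is hypothetical) to run this once a day at 06:00:
#   0 6 * * * /usr/bin/python3 /path/to/poll.py

# Trains to poll on every run, whether or not anyone searched for them
WATCH_LIST = [("6", "Denver")]

def poll_once(watch_list, fetch_delay, store, today=None):
    """Fetch and store one reading per watched train; return how many were stored."""
    today = today or date.today()
    stored = 0
    for train, station in watch_list:
        try:
            delay = fetch_delay(train, station)  # minutes late, or None if unknown
        except Exception as exc:
            # One failed lookup should not stop the rest of the list
            print(f"train {train} at {station}: lookup failed ({exc})")
            continue
        if delay is not None:
            store(train, station, today.isoformat(), delay)
            stored += 1
    return stored

# Demo with stand-in functions instead of a real lookup and database
readings = []

def fake_fetch(train, station):
    return 12  # pretend every train is 12 minutes late

def fake_store(train, station, service_date, delay_min):
    readings.append((train, station, service_date, delay_min))

count = poll_once(WATCH_LIST, fake_fetch, fake_store, today=date(2000, 1, 1))
print(count, readings)
```

Because the job runs at least once inside the five-day window, no day falls out of the source data before it has been saved.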
 
I have hidden the link to this site until the moderators and I determine how to fairly apply an old policy regarding promotion of personal websites across the forum.

Back with more soon.
 
I have no doubt this would be nearly impossible. It was just a passing thought, since I have noticed some weather-related delays :) I fully realize the possibilities are endless and kinda wish I hadn't mentioned the concept here.

Thanks again to Faraz and to our moderators for their efforts to figure out a fair way to link your site.

Deimos
 
This thread encouraged me to do my own stats for the Southwest Chief prior to my departure to San Diego on Tuesday. I think I may have scared myself. Actual on-time departure performance (to the minute) for the entire route over the past 7 days has been 32%. Departure from the first scheduled stop at KC, MO was 71% on time. That drops to 43% at La Junta, CO, 43% at Albuquerque, NM, and 0% at Flagstaff, but with a remarkable 43% on-time arrival performance at LAX. That's wonderful, but the delays into Fullerton, where I make a connection, mean that I only have a 43% chance of making it to San Diego by 10:10, a 57% chance by 11:20, and an 86% chance by 12:25 PM.
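With only seven days of data, each per-station percentage is just a count out of 7. A quick check in Python, assuming one train per day over those 7 days; the counts are inferred from the rounded percentages quoted above rather than taken from the original tally.

```python
DAYS = 7

def pct(count, total=DAYS):
    """Percentage of days, rounded to the nearest whole number."""
    return round(100 * count / total)

for count in (3, 4, 5, 6):
    print(f"{count} of {DAYS} days -> {pct(count)}%")
```

On that reading, 43% is 3 days out of 7, 57% is 4, 71% is 5 and 86% is 6, so a single train on a single day moves any of these figures by about 14 points.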
 
My experience with the SWC is similar to what your statistics suggest. Leaving the Albuquerque metro area (we sat on a siding for around half an hour just past the station), we were almost 4 hours late. Arriving in LA, we were only about 20 minutes late. The train really flew out of the mountains into California.
 
Just following up: have you decided on a place to link to the page? A lot of forums have a 'useful links' section where members can post relevant links. Perhaps people could then refer to it from there. I have added a lot of new functionality to the page, so it should be quite useful now.
 