You're assuming that past data is stored in a database accessible to the program that makes the service predictions.
Given that the website only gives access to a few days' worth of data, that's likely a bad assumption.
Regardless of how easy you may think it is, it doesn't come free, and you may have noticed that Amtrak isn't exactly swimming in cash.
Let's say it is in a different database. It still isn't hard to copy that data over to the database used for service predictions.
Honestly, I don't think the reason they do optimistic/best-case predictions is that it's too hard to program. More than likely, it's because they don't want to predict that a train will come in (or, more importantly, leave) later than it actually does. If the status page said the train was estimated to leave at 10:45 AM and it actually left at 10:30 AM, someone would complain that they missed their train because the Amtrak website said it wouldn't leave until 10:45 AM. That would happen even if the schedule (and their ticket/reservation) said the train leaves at 9:45 AM, and even if Amtrak posted a notice that the time is an estimate and that trains often make up time en route, so the train may arrive (and leave) sooner than shown. Someone will complain.
And that's why we can't have as many nice things.


