That’s a lot of duplicates! I’ve written a script to remove duplicates. I’ve run that script multiple times, and every few months when I go to create a new lifetime overlay of all my rides, I’ve noticed the duplicates have reappeared. Well, it’s been a couple months and they are back!
My first thought was that something must be wrong with the script that retrieves all my rides, which uses a particularly tricky set of conditions to distinguish real rides from route planning. Still, it is literally just a complicated “SELECT” query, which only reads data, so I quickly dismissed the idea that this was the source of the problem.
Then I looked at the dates of the duplicates and noticed that the copies were being created at suspiciously regular intervals. My next thought was some sort of “backup” cron job gone wrong. Two problems with that:
- My backup only dumps the entire database. It doesn’t copy individual routes.
- The dates weren’t regular enough to be any kind of cron job. Counting from the last time I deleted all the duplicates, the creation dates of the new duplicates were roughly 2-5 days apart.
That kind of pseudo-regularity made me think of a web crawler or robot script, and then it hit me what the likely problem was: I have a hyperlink to “copy” a route, which is super helpful for me when I’m doing route planning. Robots / web crawlers were hitting this link every few days, and each hit made a copy of the route.
I check ownership of the route for the “edit” and “delete” links, but not for “copy”, because I wanted other people to be able to make copies of my routes. Unfortunately, the way I was handling this meant that even if you weren’t logged in you could create a copy, and the copy would preserve all the information, including “owner”, so it would show up in my account as a new copy of the ride.
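To make the bug concrete, here is a minimal Python sketch of this kind of mistake and one way to close it. It is not the site's actual code: `Route`, `copy_route_unsafe`, `copy_route`, and the `current_user` argument are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, replace
from itertools import count
from typing import Optional

# Hypothetical stand-in for the database's auto-increment route id.
_next_id = count(1)


@dataclass(frozen=True)
class Route:
    id: int
    owner: str
    name: str


def copy_route_unsafe(route: Route) -> Route:
    """Flawed version: no login check, and the owner field is copied as-is."""
    # Anyone (including a crawler following a link) gets a new row that
    # still belongs to the original owner.
    return replace(route, id=next(_next_id))


def copy_route(route: Route, current_user: Optional[str]) -> Route:
    """Guarded version: require a login and give the copy to the requester."""
    if current_user is None:
        raise PermissionError("log in to copy a route")
    return replace(route, id=next(_next_id), owner=current_user)


original = Route(id=next(_next_id), owner="me", name="Spring break ride")
crawler_copy = copy_route_unsafe(original)   # still owned by "me"
visitor_copy = copy_route(original, "visitor")  # owned by "visitor"
```

With the unguarded version, every anonymous hit adds another route to the original owner's account, which is exactly how the duplicates pile up. A related hardening, which is an assumption on my part rather than something described here, is to put state-changing actions like this behind a form submission instead of a plain link, since crawlers follow links.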
Problem solved … simply restrict this function to logged-in users, which then properly transfers ownership of the copy to them (instead of leaving it as a copy in my account), run the “delete duplicates” script again, and look forward to NOT having to delete a bunch of duplicates again in a few months when I’m ready to create my next overlay map, like the most recent one below capturing my South Carolina, Georgia, Alabama, and Tennessee spring break rides: