While I understand the points, I think it's worth being kinder to someone who has come out and written about how they failed with a project.
> 1. The title makes it sound like the author spent a lot of time on this project. But really, this mostly consisted of noting down a couple of URLs per day. So maybe 5 min / day = ~130h spent on the project. Let's say 200h to be on the safe side.
Consistent work over multiple years shouldn't be looked down on like this. If you've done something every day for years it's still a lot of time in your life. We're not econs and so I don't think summing up the time really captures it either.
> 3. "If I would have finished the project, this dataset would then have been released" ==> There is literally nothing stopping OP from still doing this. It costs maybe 2h of work and would potentially give a substantial benefit to others, i.e., turn this project into a win after all. I'm very puzzled why OP didn't do this.
They might not know how to do this sustainably, or they might just be mentally done with it. It may be harder for them to think about.
I'd also recommend that they release the data. If they put it on either Zenodo or Figshare, it'll be hosted for free and others can reference it.
> 2. "Get first analyses results out quickly based on a small dataset and don’t just collect data up front to “analyse it later”" => I think this actually killed the project.
I agree, but again on the kinder side (because I think they agree too): there are multiple reasons for getting early results out, and focusing on why might be more productive.
1. It gets you to actually process the data in some useful form. So many times I've seen things fail late on because people hadn't checked something like how dates are formatted or whether some field is often missing, or they didn't capture something that turns out to be pretty key (e.g. you scrape times, then realise later that at some point the site changed them to "two weeks ago" and you didn't notice).
This can be as simple as just plotting some data, counting uniques, anything. If it runs automatically, it will fall over when something goes wrong and you can go and check.
2. What do people care about? What do you care about? Sometimes I've had a great idea for an analysis only to realise later that maybe I'm the only one who cares or, worse, the result is so obvious it's not even interesting to me.
3. Keeping interest. Keeping interest in a multi-year project that's giving you something back can be easier than in one that's just taking.
4. Guilt. If I spend a long time on something, I feel it should be better. So I want to make it more polished, which takes time, which I don't have. So I don't add to it, then I'm not adding anything, then nothing happens. It shouldn't matter, but I've long realised that just wishing my mind worked differently isn't a good plan and instead I should just plan for reality. For that, doing something fast feels much better - I am happier releasing something that's taken me half a day and looks kinda-ok because
5. Get it out before something changes. COVID had, or has, no endpoint known up front.
6. Ensure you've actually got a plan. Unless you've got a very good reason, you can probably build what you need to analyse things and release it earlier. You can't run an analysis on an upcoming election, but even then you could do it on a previous year and see things working. This can help with motivation because at the end you don't have "oh right, now I need to write and run loads of things"; you just need to hit go again.
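
On point 1, here's roughly the kind of quick check I mean. It's only a sketch: the field names (`url`, `date`), the expected date format and the `quick_check` helper are all made up for illustration, not taken from OP's project.

```python
from collections import Counter
from datetime import datetime


def quick_check(records, required=("url", "date"),
                date_field="date", date_format="%Y-%m-%d"):
    """Return simple health stats for a list of dict records.

    All field names and the date format are hypothetical defaults.
    """
    missing = Counter()   # how often each required field is empty or absent
    bad_dates = []        # (row index, raw value) for dates that don't parse

    for i, rec in enumerate(records):
        for field in required:
            if not rec.get(field):
                missing[field] += 1

        value = rec.get(date_field)
        if value:
            try:
                datetime.strptime(value, date_format)
            except ValueError:
                bad_dates.append((i, value))

    # Count distinct non-empty values per required field.
    unique = {
        field: len({rec[field] for rec in records if rec.get(field)})
        for field in required
    }

    return {
        "rows": len(records),
        "missing": dict(missing),
        "bad_dates": bad_dates,
        "unique": unique,
    }


# Tiny made-up sample, including a relative date like the one mentioned above.
sample = [
    {"url": "https://example.com/a", "date": "2021-03-01"},
    {"url": "https://example.com/b", "date": "two weeks ago"},
    {"url": "https://example.com/a", "date": ""},
]

report = quick_check(sample)
print(report)
```

Running something like this every day or week alongside the collection would flag a format change or a field that has gone missing while it's still cheap to fix, rather than years later at analysis time.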