logoalt Hacker News

jesse__yesterday at 4:40 PM15 repliesview on HN

I've done a handful of interviews recently where the 'scaling' problem involves something that comfortably fits on one machine. The funniest one was ingesting something like 1gb of json per day. I explained, from first principals, how it fits, and received feedback along the lines of "our engineers agreed with your technical assessment, but that's not the answer we wanted, so we're going to pass". I've had this experience a good handful of times.

I think a lot of people don't realize machines come with TBs of RAM and hundreds of physical cores. One machine is fucking huge these days.


Replies

kevmo314yesterday at 5:11 PM

The wildest part is they’ll take those massive machines, shard them into tiny Kubernetes pods, and then engineer something that “scales horizontally” with the number of pods.

show 6 replies
yndoendoyesterday at 6:07 PM

I recently had to parse 500MB to 2GB daily log files into analytical information for sales. Quick and dirty, the application would of needed 64GB RAM and work laptop only has 48GB RAM. After taking time cleaning it up, it was using under 1GB of RAM and worked faster by only retaining records in RAM if need be between each day.

It is not about what you are doing, it is always about how you do it.

This was the same with doing OCR analysis of assembly and production manuals. Quick and dirty, it would of took over 24 hours of processing time, after moving to semaphores with parallelization it took less than two hours to process all the information.

show 1 reply
bauerdyesterday at 6:26 PM

In interviews just give them what they are looking for. Don't overthink it. Interviews have gotten so stupidly standardized as the industry at large copied the same Big Tech DSA/System Design/Behavioral process. And therefore interview processes have long been decoupled from the business reality most companies face. Just shard the database and don't forget the API Gateway

show 4 replies
dehrmannyesterday at 6:36 PM

> but that's not the answer we wanted

You could have learned this if you were better about collecting requirements. You can tell the interviewer "I'd do it like this for this size data, but I'd do it like this for 100x data. Which size should I design this for?" If they're looking for one direction and you ask which one, interviewers will tell you.

show 1 reply
anshumankmrtoday at 9:26 AM

Though I do not know the situation AT the firm you were interviewing in, if there is some unexpected increase in data volume OR say a job fails on certain days or you need to do some sort of historical data load (>= 6 months of 1 gig of data per day), the solution for running it on a single VM might not scale. BUT again, interviews are partially about problem solving, partially about checking compliance at least for IC roles (IN my anecdotal experience).

That being said yeah I too have done some similar stuff where some data engineering jobs could be run on a single VM but some jobs really did need spark, so the team decision was to fit the smaller square peg into a larger square peg and call it a da.In fact, I had spent time refactoring one particular pivotal job to run as an API deployed on our "macrolith" and integrated with our Airflow but it was rejected, so I stopped caring about engineering hygiene.

show 3 replies
winridtoday at 5:31 AM

I've actually worked on distributed systems that were so broken, I created a script to connect to prod and just create the report from my laptop. My manager offered to buy me a second laptop for running the report since it was easier than getting approval from the architects to get rid of the distributed report system (it only created that one report).

coliveirayesterday at 5:09 PM

Yes, but then how are these people going to justify the money they're spending on cloud systems?... They need to find only reasons to maintain their "investment", otherwise they could be held as incompetent when their solution is proven to be ineffective. So, they have to show that it was a unanimous technical decision to do whatever they wanted in the first place.

ytoawwhra92yesterday at 10:27 PM

> I explained, from first principals, how it fits, and received feedback along the lines of "our engineers agreed with your technical assessment, but that's not the answer we wanted, so we're going to pass". I've had this experience a good handful of times.

Probably a better outcome than being hired onto a team where everyone know you're technically correct but they ignore your suggestions for some mysterious (to you) reason.

show 1 reply
franciscoptoday at 12:24 AM

I have a funny story I need to tell some day about how I could get a 4GB JSON loaded purely in the browser at some insane speed, by reading the bytes, identifying the "\n" then making a lookup table. It started low stakes but ended up becoming a multi-million internal project (in man-hours) that virtually everyone on the company used. It's the kind of project that if started "big" from the beginning, I'd bet anything it wouldn't have gotten so far.

Edit: I did try JSON.parse() first, which I expected to fail and it did fail BUT it's important that you try anyway.

show 1 reply
ahartmetzyesterday at 5:53 PM

Every one of these cores is really fast, too!

show 1 reply
colechristensenyesterday at 8:10 PM

Yeah I had this problem at a couple of times in startup interviews where the interviewer asked a question I happened to have expertise in and then disagreed with my answer and clearly they didn't know all that much about it. It's ok, they did me a favor.

It may or may not be related that the places that this happened were always very ethnically monotone with narrow age ranges (nothing against any particular ethnic group, they were all different ethnic monotones)

show 1 reply
badgersnakeyesterday at 6:54 PM

This kind of bad interview is rife. It’s often more a case of guess what the interviewer thinks than come up with a good solution.

sharadovtoday at 2:31 AM

Yes, yes but how are we going to get HA with one machine..

Fuck off ..you're 10 person startup with an MVP and no revenue stream needs customers first..

jitlyesterday at 8:25 PM

1gb of json u can do in one parse ¯\_(ツ)_/¯ big batches are fast

yieldcrvyesterday at 5:23 PM

“there’s no wrong answer, we just want to see how you think” gaslighting in tech needs to be studied by the EEOC, Department of Labor, FTC, SEC, and Delaware Chancery Court to name a few

let’s see how they think and turn this into a paid interview