When things scale faster than you expect, chaos follows.
A couple of months ago, I thought it would be fun to run a “tiny” data science experiment. The plan? Scrape hundreds of product reviews, run sentiment analysis, and visualize the results in a tidy dashboard.
It began innocently enough. Yet three hours later, my laptop fans were shrieking like a Boeing 747. Chrome froze. Spotify stuttered. And Jupyter Notebook looked like it was running on dial-up internet.
What went wrong? Well, I underestimated just how quickly bad code scales into disaster.
Here’s the story, and the three lessons I’ll never forget.
Mistake 1: Pandas Everywhere
My first step was loading all reviews (2.3M rows) straight into Pandas. Because, naturally, that’s what every tutorial does.
import pandas as pd
df = pd.read_csv("reviews.csv")
sentiments = df["review"].apply(analyze_sentiment)
The problem? Pandas doesn’t care that your laptop has 8 GB of RAM. It will happily eat everything, and then some.
Within minutes, I was staring at a frozen screen. Lesson learned: Pandas is powerful, but it’s not built for raw scale on consumer …