Kopfknacker

~ breaking my head...

Analyzing DATA – Pandas, Python and Star Trek: The Next Generation

Monday, 11 Jan 2016

Posted by Christoph Diefenthal in Data Analytics, Technologie

Tags

data analysis, ipython, numpy, pandas, python

Over the last couple of months I worked on getting my head around NumPy, Python and pandas. I will get into the technical challenges and the steep learning curve in a following blog post. It is frustrating at first, but then enlightening :-) But first I need to show some results!

I thought that, rather than being the one-millionth person working on a Kaggle project, I would try to get my own data set to play with…
So I came up with the idea of analyzing the transcripts of Star Trek: The Next Generation (STNG). I did not have to google for long before I found some nice-looking transcripts at chakoteya.net. I did some web scraping to download all the text files and put them into a pandas DataFrame.
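As a rough illustration of the "text file to DataFrame" step, here is a minimal sketch. It is not the code from my notebook: the `parse_transcript` helper, the column names and the regular expression are assumptions, and the sketch assumes that each spoken line looks like `SPEAKER: text`, optionally with a bracketed note such as `[OC]` after the name.

```python
import re
import pandas as pd

# Assumed line format: speaker name in capitals, an optional bracketed
# note such as [OC], a colon, then the spoken text.
SPEAKER_LINE = re.compile(r"^([A-Z][A-Z0-9' .]*?)(?:\s*\[[^\]]*\])?:\s*(.+)$")

def parse_transcript(text, episode):
    """Turn one transcript into rows of (episode, character, line)."""
    rows = []
    for raw in text.splitlines():
        match = SPEAKER_LINE.match(raw.strip())
        if match:  # scene headings and stage directions are skipped
            rows.append({
                "episode": episode,
                "character": match.group(1).strip(),
                "line": match.group(2).strip(),
            })
    return pd.DataFrame(rows, columns=["episode", "character", "line"])

# Made-up sample text in the assumed format
sample = """\
[Bridge]
PICARD: Engage.
RIKER [OC]: Aye, sir.
(The ship jumps to warp)
K'EHLEYR: Not even a bite on the cheek?
"""

lines = parse_transcript(sample, episode=1)
print(lines)
```

With one such DataFrame per episode, `pd.concat` can stack them into a single table with one row per spoken line.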

Thanks to the author of the transcripts! I had to do a little data cleaning (some misspellings here, some stray line breaks there), but not much was necessary. The quality is pretty good!

Long story short: Have a look! Here are some examples:

The "line-pie": the distribution of spoken lines for the 25 characters with the most spoken lines in STNG:

Picard obviously has a lot to say…

the lines-pie
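A pie like this boils down to counting rows per character. The following is only a sketch on made-up toy data, assuming a DataFrame with one row per spoken line and a `character` column (both are assumptions, not the notebook's actual layout):

```python
import pandas as pd

# Toy data: one row per spoken line
lines = pd.DataFrame({
    "episode":   [1, 1, 1, 2, 2, 2, 2],
    "character": ["PICARD", "PICARD", "RIKER", "DATA",
                  "PICARD", "DATA", "PICARD"],
})

# Count lines per character and keep the 25 busiest speakers
top = lines["character"].value_counts().head(25)

# Share of each character among those top speakers, in percent
share = (top / top.sum() * 100).round(1)
print(share)

# top.plot.pie() would draw the chart (needs matplotlib installed)
```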

 

The number of episodes a character had the most lines in:

Picard, not surprisingly, dominated 76 episodes. But who was K’EHLEYR again??

PICARD       76 episodes
DATA         20
RIKER        16
LAFORGE      10
CRUSHER       9
WORF          8
TROI          4
LWAXANA       3
BARCLAY       2
WESLEY        2
K'EHLEYR      2 (who was that again??)
CLARA         1
CLEMENS       1
CONOR         1
JEV           1
ARMUS         1
DURKEN        1
FAJO          1
AMANDA        1
JAMESON       1
JELLICO       1
MADRED        1
MARR          1
OKONA         1
PICARD JR     1
Q             1
RAL           1
RASMUSSEN     1
RIKER 2       1
RO            1
SALIA         1
SCOTT         1
SITO          1
SPOCK         1
ALKAR         1
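One way to arrive at a table like the one above is to count lines per episode and character, pick the top speaker of each episode, and then count how often each name comes out on top. This is a sketch on toy data with assumed column names, not necessarily how the notebook does it:

```python
import pandas as pd

# Toy data: one row per spoken line
lines = pd.DataFrame({
    "episode":   [1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
    "character": ["PICARD", "PICARD", "RIKER",
                  "DATA", "DATA", "DATA", "PICARD",
                  "PICARD", "PICARD", "DATA"],
})

# Rows: episodes, columns: characters, values: number of lines
per_episode = (lines.groupby(["episode", "character"])
                    .size()
                    .unstack(fill_value=0))

# Character with the most lines in each episode
# (on a tie, idxmax keeps the first column it finds)
top_speaker = per_episode.idxmax(axis=1)

# Number of episodes each character "won"
dominated = top_speaker.value_counts()
print(dominated)
```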

Picard lost his words

In the last 50 episodes, Picard had more episodes with far fewer spoken lines than average.

picard-had-more-episodes-with-fewer-lines-in-the-end
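To make "far fewer lines than average" concrete, one could compare Picard's line count in each episode with his own mean over all episodes. The sketch below uses toy data, and the cut-off of half the average is an arbitrary assumption for illustration, not the criterion behind the chart:

```python
import pandas as pd

# Toy data: one row per spoken line
lines = pd.DataFrame({
    "episode":   [1, 1, 1, 1, 2, 2, 2, 2, 3, 4, 4],
    "character": ["PICARD"] * 8 + ["DATA", "PICARD", "DATA"],
})

all_episodes = sorted(lines["episode"].unique())

# Picard's lines per episode; episodes without any Picard line count as 0
picard = (lines[lines["character"] == "PICARD"]
          .groupby("episode")
          .size()
          .reindex(all_episodes, fill_value=0))

average = picard.mean()

# Assumed threshold: fewer than half of his average number of lines
quiet = picard[picard < 0.5 * average]
print(quiet)
```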

The "Crusher-Pulaski Gap": episodes 26 to 47

And she was never seen afterwards?

the crusher-pulaski-gap
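A gap like this can also be found without a chart by looking at the episodes in which a character speaks at all. Again just a sketch on toy data with assumed column names; the episode numbers here are made up and are not the real 26 to 47:

```python
import pandas as pd

# Toy data: one row per spoken line
lines = pd.DataFrame({
    "episode":   [1, 2, 3, 3, 4, 5],
    "character": ["CRUSHER", "CRUSHER", "PULASKI",
                  "PULASKI", "PULASKI", "CRUSHER"],
})

# First and last episode in which each character has a line
span = lines.groupby("character")["episode"].agg(["min", "max"])
print(span)

# Episodes between Crusher's first and last appearance where she is silent
crusher_eps = set(lines.loc[lines["character"] == "CRUSHER", "episode"].tolist())
gap = [ep for ep in range(min(crusher_eps), max(crusher_eps) + 1)
       if ep not in crusher_eps]
print(gap)
```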

 

The "Timescape" Episode

The one where three main characters all talk far more than average at the same time:

the-timescape-episode
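An episode like this can be spotted by comparing each character's line count per episode with that character's own average and counting how many characters are well above it at once. The numbers below are invented, and the factor of 1.5 is an assumption chosen for illustration:

```python
import pandas as pd

# Toy table: rows are episodes, columns are characters,
# values are the number of spoken lines
per_episode = pd.DataFrame(
    {"PICARD": [2, 2, 6, 2], "DATA": [2, 3, 6, 1], "TROI": [1, 2, 5, 2]},
    index=pd.Index([1, 2, 3, 4], name="episode"),
)

# True where a character speaks more than 1.5 times their own average
above = per_episode.gt(1.5 * per_episode.mean(), axis="columns")

# Number of unusually talkative characters per episode
talkative = above.sum(axis=1)
print(talkative)
print("Most unusual episode:", talkative.idxmax())
```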

What you always wanted to know about Wesley

evaluating-wesley

There are more and more diagrams and insights in the IPython notebook: github.com/…/startrekng-episodes-analysis_02.ipynb

Have fun! Any feedback is more than welcome!

 

 
