Regis Ransomware

Summary

Only a few days before the fall semester was to start in August 2019, Regis University was hit by ransomware. IT panicked and unplugged everything. The wireless internet was down campus-wide. Email was out. The website was down, phones were off, and even the parking payment stations were shut off. And new freshmen were on their way to campus to move in the next day. It was a total cyber disaster.

Here, I’ll describe why and how the attack probably happened, and why Regis and many other universities will almost certainly be hacked again. I’ll also suggest easy precautions organizations can take to prevent ransomware attacks.

TensorFlow functions with Keras

Summary

Recently, I was trying to use Cohen’s Kappa as a metric with Keras. I decided to use the TensorFlow contrib function that already existed. While trying to get TensorFlow working with Keras, I discovered there were no easy-to-find documents describing how to do this. The example from the Keras blog is a few years old and no longer works. So after figuring out how to get TensorFlow working with Keras, I decided to document it (for the children).

Why use TensorFlow with Keras? TF, particularly its contrib portion, has many functions that are not available in Keras’ backend. Ideally you would stick to Keras’ backend functions, but for custom loss functions, metrics, or other custom code, it can be convenient to draw on TF’s codebase.
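For reference, here is what Cohen’s Kappa itself computes: agreement between two label sequences, corrected for the agreement expected by chance. This is a minimal plain-Python sketch, not the TensorFlow contrib function and not a Keras-ready metric (that would need tensor operations). The function name `cohen_kappa` and the handling of the degenerate case are my own assumptions for illustration.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for two equal-length sequences of class labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and the same length")

    n = len(y_true)

    # Observed agreement: fraction of positions where the labels match.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement: for each label, the product of its marginal
    # frequencies in the two sequences, summed over all labels.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(
        true_counts[label] * pred_counts[label] for label in true_counts
    ) / (n * n)

    # Degenerate case: both sequences contain a single identical label.
    # Kappa is undefined here; returning 1.0 is a convention chosen
    # for this sketch only.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    print(cohen_kappa([0, 1, 1, 0], [0, 1, 0, 0]))
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean agreement worse than chance.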

Sentiment analysis

Summary

Maybe you want to see how people are responding to a story or blog post of yours: do they love it or hate it? Does it make them excited? Maybe you want to detect the tone of people mentioning your company or product. Or perhaps you want to keep tabs on the mood around stocks in order to make money. We can use sentiment analysis for all of these things.

In this multi-part series, we will look at different methods of sentiment and emotion analysis in both Python and R. We will compare performance on a standard dataset, and also scrape our own live tweets for analysis. Finally, we will check performance on stock-related text snippets from news headlines and StockTwits.
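To give a flavor of the simplest kind of method, here is a toy lexicon-based scorer in Python. It is only a sketch under loud assumptions: the tiny `LEXICON` word list and the `sentiment_score` function are hypothetical, invented for illustration, and are not necessarily among the methods the series covers.

```python
import re

# Hypothetical toy lexicon: +1 for positive words, -1 for negative words.
# Real lexicons contain thousands of scored entries.
LEXICON = {
    "love": 1,
    "great": 1,
    "excited": 1,
    "hate": -1,
    "terrible": -1,
    "boring": -1,
}


def sentiment_score(text):
    """Return the average lexicon score per token, in the range [-1, 1]."""
    # Lowercase the text and keep runs of letters/apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    # Words missing from the lexicon count as neutral (score 0).
    return sum(LEXICON.get(token, 0) for token in tokens) / len(tokens)


if __name__ == "__main__":
    for snippet in ["I love this great story", "I hate it"]:
        print(snippet, "->", sentiment_score(snippet))
```

A scorer like this ignores negation, sarcasm, and context, which is one reason to compare several methods on a standard dataset.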

Cannabis recommender and science

Summary

I scraped all of leafly.com’s reviews, as well as 20,000 chemistry measurements of cannabis products. I used the reviews to build a collaborative recommendation engine, as well as a similar-strain recommender (still in progress). While studying the reviews and chemistry data, I found that both tend to cluster best into three groups. Two of the groups are similar, but the third is a high-CBD group that reviewers often mention for pain and anxiety.

The app was live at cannadvise.me, but I took it down so I wouldn’t have to fund it. However, there’s a video overview here, and the GitHub code is here.

Doing data science => Grupo Bimbo deliveries

Summary

Learn how to use R to explore medium-to-large, real-world data sets, and how to train an XGBoost model to predict Grupo Bimbo’s sales volume for individual customers.
