Grigory Yaroslavtsev

This blog will cover theoretical aspects of algorithm design for large data processing. You can subscribe via RSS.

All Posts

2016

What's New in the Big Data Theory 2016

Collection of interesting papers on algorithms for big data from 2016.

The Binary Sketchman

This post discusses some of my recent work on linear compression for binary data.

Teaching “Foundations of Data Science”

Discussion of the class on Foundations of Data Science that I am teaching at IU this Fall.

ESA'16 Deadline Approaching

A quick announcement of the ALGO'16 symposium and the ESA'16 conference.

The Simple Economics of Algorithms for Big Data

In this blog post I want to suggest a simple reason why you should study your algorithms really well if you want to design algorithms that deal with big data.

2015

Teaching algorithms for Big Data

In this post I share my experience teaching a class on algorithms for big data at the University of Pennsylvania.

Slides and Videos from DIMACS

Slides and videos from the DIMACS workshop “Big Data through the Lens of Sublinear Algorithms” are now available.

East Coast Workshops

Two events that might be of interest to the readers of this blog are happening on the East Coast next week.

Algorithms for Big Data

This is an announcement of an upcoming class at Penn, which I am going to teach in Fall'15.

Colocate, Colocate, Colocate

Adding my two cents to the discussion of the new format for STOC/FOCS conferences.

Upcoming Workshops

As promised in the New Year's post, this year there are a lot of activities related to sublinear algorithms and big data.

Modern Algorithms or The Brave New O of the Big N

In this post I offer some suggestions about possible changes in the modern algorithms curriculum.

Models for Parallel Computation (Hitchhiker's Guide to Massively Parallel Universes)

In this post Sergei Vassilvitskii and I describe the most commonly used models for massively parallel computation and the relationships between them.

MapReduce and RDBMS: Practice and Theory

In this post I discuss the differences between RDBMSs and MapReduce, addressing Michael Stonebraker's criticism.

Sublinear Day at MIT

The second “Sublinear Algorithms and Big Data Day” will take place at MIT on April 10.

Happy Sublinear Year!

2014

Getting a Research Internship

In this post I share some advice and experience about getting a research internship at industrial research labs.

Massively Parallel Clustering: Overview

Clustering is one of the main vehicles of machine learning and data analysis. In this post I will introduce three algorithms for clustering massive data.

Penn Big Data Reading Group

This semester I am running a reading group on algorithms for big data at UPenn.

Model for Massively Parallel Computation

In this post I will introduce a theoretical model for computation in centralized distributed massively parallel computational systems (or in short clusters l...

About this blog

This blog will cover theoretical aspects of algorithm design for large data processing.