I am an assistant professor at the Computer Science Department at Indiana University, Bloomington and a founding director of the Center for Algorithms and Machine Learning (CAML). I also hold an adjunct faculty appointment at the Department of Statistics.
Use this link to apply to our Ph.D. program. Check this ad and please take a look at my papers prior to contacting me about Ph.D. positions in my group. Internships are available for selected international undergraduate students. See also my diversity statement. 
I am working on efficient algorithms for scalable data analysis including:
Currently I am working with:
Recent theoretical computer science program committees I serve(d) on: APPROX 18, BeyondMR 18, COCOON 17, SODA 17, ESA 16, SOFSEM 15. I also serve(d) on the program committee (as a reviewer) for the following machine learning conferences: NeurIPS ('19, '18, '17, '16), ICML ('19, '18), ICLR ('19, '18), AISTATS ('19), UAI ('19).
Recently created/taught/organized:
G. Yaroslavtsev^{*},
A. Vadapalli
Massively Parallel Algorithms and Hardness for SingleLinkage Clustering under ℓ_{p}Distances ICML 2018 (35th International Conference on Machine Learning). Long talk (8.6% acceptance rate). 
Code on Github (by Adithya Vadapalli).
S. Kannan,
E. Mossel,
S. Sanyal,
G. Yaroslavtsev
Linear Sketching over F_{2} [Preprint on ECCC]. CCC 2018 (33rd Conference on Computational Complexity). 
Blog post on “The Big Data Theory”.
M. Kearns,
A. Roth,
S. Wu,
G. Yaroslavtsev
Private Algorithms for the Protected in Social Network Search PNAS 2016 (Proceedings of the National Academy of Sciences), via direct submission. 
Spotlight story on the Penn front page: Balancing Privacy and Security in Network Analysis.
External media coverage: PBS Newshour, ACM Tech News / The Daily Pennsylvanian, Schneier on Security, AAU, Quartz, Pacific Standard, Wired (German), The Naked Scientists Podcast and Vice Motherboard.
S. Chawla,
K. Makarychev,
T. Schramm,
G. Yaroslavtsev
Near Optimal LP Rounding Algorithm for Correlation Clustering on Complete and Complete kpartite Graphs STOC 2015 (47th ACM Symposium on the Theory of Computing). 
See the Wikipedia article on correlation clustering for more details.
For applications in data mining see a Yahoo! KDD'14 tutorial by Francesco Bonchi, David GarciaSoriano and Edo Liberty.
P. Berman,
S. Raskhodnikova,
G. Yaroslavtsev
L_{p}Testing STOC 2014 (46th ACM Symposium on the Theory of Computing). 
Simple introduction by Sofya Raskhodnikova in the Encyclopedia of Algorithms (Springer).
Open problems from JHU workshop on Sublinear Algorithms [pptx, pdf].
Oded Goldreich's review on “My Choices”. Property Testing Review Post 1, Post 2.
A. Andoni,
A. Nikolov,
K. Onak,
G. Yaroslavtsev
Parallel Algorithms for Geometric Graph Problems STOC 2014 (46th ACM Symposium on the Theory of Computing). 
Notes by Thomas Steinke from Jelani Nelson's Algorithms for Big Data Course at Harvard
Notes by Kui Tang from Alex Andoni's Algorithmic Techniques for Massive Data Course at Columbia.
V. Karwa,
S. Raskhodnikova,
A. Smith,
G. Yaroslavtsev
Private Analysis of Graph Structure VLDB 2011, Research track (37th International Conference on Very Large Data Bases). 
Full version in ACM Transactions on Database Systems.
P. Berman,
A. Bhattacharyya,
K. Makarychev,
S. Raskhodnikova,
G. Yaroslavtsev
Approximation Algorithms for Spanner Problems and Directed Steiner Forest ICALP 2011, Track A (38th International Colloquium on Automata, Languages and Programming), special issue of “Information and Computation”. 
Runnerup for the Best Paper Award.
Open Problem #2 from the Princeton Workshop on Approximation Algorithms. See also my slides.
A. Kojevnikov,
A. Kulikov,
G. Yaroslavtsev
Finding Efficient Circuits Using SATsolvers SAT 2009 (12th International Conference on Theory and Applications of Satisfiability Testing).

MultiDimensional Balanced Graph Partitioning via Projected Gradient Descent.
VLDB 2019, Research track (45th International Conference on Very Large Data Bases).
Hierarchical Clustering for Euclidean Data
AISTATS 2019 (22nd International Conference on Artificial Intelligence and Statistics).
Amplification of OneWay Information Complexity via Codes and Noise Sensitivity
ICALP 2015, Track A (42nd International Colloquium on Automata, Languages and Programming).
Certifying Equality with Limited Interaction
RANDOM 2014 (18th International Workshop on Randomization and Computation).
Beyond Set Disjointness: The Communication Complexity of Finding the Intersection
PODC 2014 (33rd Annual ACM SIGACTSIGOPS Symposium on Principles of Distributed Computing).
Lower Bounds for Testing Properties of Functions over Hypergrid Domains
CCC 2014 (29th IEEE Conference on Computational Complexity).
Beating the Direct Sum Theorem in Communication Complexity with Applications to Sketching
SODA 2013 (24th Annual ACMSIAM Symposium on Discrete Algorithms).
Learning PseudoBoolean kDNF and Submodular Functions
SODA 2013 (24th Annual ACMSIAM Symposium on Discrete Algorithms).
Accurate and Efficient Private Release of Datacubes and Contingency Tables
ICDE 2013 (29th IEEE International Conference on Data Engineering).
PrimalDual Algorithms for NodeWeighted Network Design in Planar Graphs
APPROX 2012 (15th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems).
Steiner TransitiveClosure Spanners of dDimensional Posets
ICALP 2011, Track A (38th International Colloquium on Automata, Languages and Programming).
^{*} Indicates papers with nonalphabetical ordering of authors.
Optimality of Linear Sketching under Modular Updates.
Approximate F_{2}Sketching of Valuation Functions.
Online Algorithms for Machine Minimization [Preprint on arXiv].
I participated in ACM ICPC and TopCoder competitions (as griffon) competing and setting problems in TopCoder Open Algorithms Finals.
I was teaching advanced classes in algorithms for highschool students for ~5 years, coaching teams for algorithmic competitions. I participated in preparation of training camps and contests for Russian Olympiad in Informatics and International Olympiad in Informatics (both in Russia and in the U.S.).
In 2009 I cofounded a nonprofit organization focused on advanced extracurricular education in algorithms for highschool students (5 – 10 grades) in St. Petersburg, Russia. In 2013 it was expanded to Moscow and Yekaterinburg.
In the early days we were supported by , and . Now donations are welcome!
If I am not working then I am probably practicing for the next (more pictures). Let's do it together! ;)
In 2017 I was representing Team USA in the M3034 age group at the ITU Standard Distance Duathlon World Championship in Penticton, Canada. In 2019 I will be representing Team USA in the M3034 age group at the ITU Long Distance Triathlon World Championship in Pontevedra, Spain.
In 2007–2008 I was lucky to be a part of the FBReader team. This is an open source free eBook Reader project.
We developed the first version of FBReader for Android (now > 10M downloads worldwide on Google play).
There are some things I can't prove but rather just believe in. E.g. this logo I designed and proposed for the CSTheory website.
St. Petersburg Academic University is a unique center for continuous education in physics and engineering, run by Zhores Alferov, a Nobel Prize winner in Physics. In 8 years there I finished high school, B.S. and M.S. (a pilot class in theoretical computer science where I was the first student). I am forever grateful to all my teachers during those happy years!
Here is a recent video (in Russian) about the new bachelors programs at the Academic University.