Grigory Yaroslavtsev



I am an assistant professor at the Computer Science Department at Indiana University, Bloomington and a founding director of the Center for Algorithms and Machine Learning (CAML). I also hold an adjunct faculty appointment at the Department of Statistics.

Use this link to apply to our Ph.D. program. Check this ad and please take a look at my papers prior to contacting me about Ph.D. positions in my group. Internships are available for selected international undergraduate students. See also my diversity statement.

I am working on efficient algorithms for scalable data analysis, machine learning and data science including:

  • Approximation, distributed, streaming and online algorithms
  • Learning theory and property testing
  • Communication and information complexity
  • Data privacy and targeted alternatives to bulk data collection

Currently I am working with:

I serve(d) on the program committee (as a reviewer) for the following machine learning conferences: NeurIPS ('19, '18, '17, '16), ICML ('19, '18), ICLR ('19, '18), AISTATS ('19), UAI ('19). Recent theoretical computer science program committees I serve(d) on: APPROX 18, BeyondMR 18, COCOON 17, SODA 17, ESA 16, SOFSEM 15.

Recently created/taught/organized:



News


Papers (authors listed in alphabetical order unless otherwise specified)

Selected Publications

Other Publications

  • Optimality of Linear Sketching under Modular Updates

    K. Hosseini, S. Lovett, G. Yaroslavtsev

    CCC 2019 (34th Conference on Computational Complexity).

  • Adversarially Robust Submodular Maximization under Knapsack Constraints

    D. Avdiukhin, S. Mitrovic, G. Yaroslavtsev, S. Zhou

    KDD 2019, Research track (25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining). Oral presentation (9.2% acceptance rate)

  • Maximum Matchings in Dynamic Graph Streams and the Simultaneous Communication Model

    S. Assadi, S. Khanna, Y. Li, G. Yaroslavtsev

    SODA 2016 (27th Annual ACM-SIAM Symposium on Discrete Algorithms).

  • Amplification of One-Way Information Complexity via Codes and Noise Sensitivity

    M. Molinaro, D. Woodruff, G. Yaroslavtsev

    ICALP 2015, Track A (42nd International Colloquium on Automata, Languages and Programming).

  • Certifying Equality with Limited Interaction

    J. Brody, A. Chakrabarti, R. Kondapally, D. Woodruff, G. Yaroslavtsev

    RANDOM 2014 (18th International Workshop on Randomization and Computation).

    • Full version in the special issue of Algorithmica on “Information Complexity and Applications”

  • Beyond Set Disjointness: The Communication Complexity of Finding the Intersection

    J. Brody, A. Chakrabarti, R. Kondapally, D. Woodruff, G. Yaroslavtsev

    PODC 2014 (33rd Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing).

  • Lower Bounds for Testing Properties of Functions over Hypergrid Domains

    E. Blais, S. Raskhodnikova, G. Yaroslavtsev

    CCC 2014 (29th IEEE Conference on Computational Complexity).

  • Beating the Direct Sum Theorem in Communication Complexity with Applications to Sketching

    M. Molinaro, D. Woodruff, G. Yaroslavtsev

    SODA 2013 (24th Annual ACM-SIAM Symposium on Discrete Algorithms).

  • Learning Pseudo-Boolean k-DNF and Submodular Functions

    S. Raskhodnikova, G. Yaroslavtsev

    SODA 2013 (24th Annual ACM-SIAM Symposium on Discrete Algorithms).

  • Accurate and Efficient Private Release of Datacubes and Contingency Tables

    G. Yaroslavtsev*, G. Cormode, C. M. Procopiuc, D. Srivastava

    ICDE 2013 (29th IEEE International Conference on Data Engineering).

  • Primal-Dual Algorithms for Node-Weighted Network Design in Planar Graphs

    P. Berman, G. Yaroslavtsev

    APPROX 2012 (15th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems).

* Indicates papers with non-alphabetical ordering of authors.

Preprints

  • Approximate F2-Sketching of Valuation Functions.

    G. Yaroslavtsev, S. Zhou

  • Going for Speed: Sublinear Algorithms for Dense r-CSPs

    G. Yaroslavtsev

Talks

  • Advances in Hierarchical Clustering of Vector Data [ Video from Northwestern (link)] [Slides: (pptx), (pdf)]
    • University of Warwick, Warwick, UK. Discrete Mathematics and Applications Seminar. June 06, 2019.
    • University of Oxford, Oxford, UK. Algorithms and Complexity Seminar. May 30, 2019.
    • Facebook Core Data Science, Menlo Park, CA. March 11, 2019.
    • Johns Hopkins University, Baltimore, MD. Algorithms and Complexity Seminar. March 06, 2019.
    • Northwestern University, Evanston, IL. Computer Science Seminar. March 01, 2019.
  • Advances in Linear Sketching over Finite Fields [ Video from Simons (link)][Slides: (pptx), (pdf)]
  • Badger Rampage: Multidimensional Balanced Partitioning of Facebook-scale Graphs [Slides: (pptx), (pdf)]
  • Massively Parallel Algorithms and Hardness for Single-Linkage Clustering Under ℓp-Distances [ Video from ICML'18 (link)] [Slides: (pptx), (pdf)]
    • IBM Research Almaden, San Jose, CA. Theory Seminar. August 07, 2018.
    • 35th International Conference on Machine Learning (ICML'18), Stockholm, Sweden. July 12, 2018.
    • Stanford University, Stanford, CA. Theory Seminar. May 17, 2018.
    • University of Warwick, Warwick, UK. “Workshop on Data Summarization”. March 20, 2018.
  • Linear Sketching for Functions over Boolean Hypercube [Slides: (pptx), (pdf)]
    • University of Michigan, Ann Arbor, MI. Theory Seminar. September 14, 2018.
    • Toyota Technologicial Insitute. 68th Midwest Theory Day. Chicago, IL. April 13, 2018.
    • 58th Annual Symposium on Foundations of Computer Science (FOCS 2017). Workshop “Linear Sketching as a Tool for Everything”, Berkeley, CA. October 14, 2017.
  • Clustering on Clusters 2049: Massively Parallel Algorithms for Clustering Graphs and Vectors [Slides: (pptx), (pdf)]
    • Facebook, Menlo Park, CA. Tech Talk. October 13, 2017.
  • Computational and Communication Complexity In Massively Parallel Computing [Slides: (pptx), (pdf)]
    • ITMO University, St. Petersburg, Russia. Departmental Colloquium. June 15, 2017.
    • Higher School of Economics, Moscow, Russia. Workshop on Complexity of Computation, Communication, Descriptions and Proofs. June 14, 2017.
  • Clustering on Clusters: Massively Parallel Algorithms for Clustering Graphs and Vectors [Slides: (pptx), (pdf)]
    • Facebook, Menlo Park, CA. Tech Talk. February 09, 2017.
  • Linear Sketching with Parities [ Video from Banff (link)][Slides: (pptx), (pdf)]
    • 33rd Conference on Computational Complexity (CCC'18), San Diego, CA. June 22, 2018.
    • St. Petersburg Department of Steklov Institute of Mathematics of the Russian Academy of Sciences. Theory Seminar. June 02, 2017.
    • Moscow State University, Moscow, Russia. Kolmogorov Seminar on Theoretical Computer Science. May 22, 2017.
    • Banff International Research Station, Banff, Canada. Workshop on Communication Complexity and Applications. March 20, 2017.
    • Columbia University, New York, NY. Theory Seminar. November 21, 2016.
    • University of Pennsylvania, Philadelphia, PA. Theory Seminar. October 21, 2016.
    • University of Utah, Salt Lake City, UT. Theory Seminar. September 09, 2016.
    • University of Illinois. Urbana, IL. Theory Seminar. August 12, 2016.
    • Microsoft Research. Redmond, WA. Theory Seminar. June 29, 2016.
  • What's New in “The Big Data Theory”?
    • Drexel University, Philadelphia, PA. Departmental Colloquium. March 09, 2016.
    • Boston University, Boston, MA. Departmental Colloquium. February 29, 2016.
    • University of Colorado, Boulder, CO. Departmental Colloquium. February 25, 2016.
    • Indiana University, Bloomington, IN. Departmental Colloquium. February 22, 2016.
    • Georgetown University, Washington, DC. Departmental Colloquium. February 18, 2016.
    • College of William and Mary, Williamsburg, VA. Departmental Colloquium. February 08, 2016.
  • Fast Fourier Sparsity Testing over the Boolean Hypercube [Slides: (pptx), (pdf)]
    • University of Wisconsin, Madison. Theory Seminar. August 06, 2015.
  • Near Optimal LP Rounding for Correlation Clustering [ Video from MSR (link)] [Slides: (pptx), (pdf)]
    • Cornell University, Ithaca, NY. Theory Seminar. May 04, 2015.
    • MIT, Boston, MA. Algorithms and Complexity Seminar. April 09, 2015.
    • Microsoft Research, Redmond, WA. March 12, 2015.
    • Google Research, NYC. Google Tech Talk. February 17, 2015.
    • Rutgers University, New Brunswick, NJ. Theory Seminar. January 28, 2015.
    • Carnegie Mellon University, Pittsburgh, PA. Theory Lunch. January 21, 2015.
    • Pennsylvania State University, State College, PA. Departmental colloquium. January 20, 2015.
  • Lower Bounds for Testing Properties of Functions over Hypergrids [Slides: (pptx), (pdf)]
    • 29th IEEE Conference on Computational Complexity (CCC 2014), Vancouver, BC. June 13, 2014.
  • Beyond Set Disjointness: the Communication Complexity of Finding the Intersection [Slides: (pptx), (pdf)]
    • 33rd Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing (PODC 2014), Paris, France.
    • MIT, Boston, MA. Theory of Distributed Systems Seminar. May 16, 2014.
  • “The Big Data Theory” and Randomized Algorithms
    • Georgia Tech, Atlanta, GA. March 05, 2014.
  • Approximating Graph Problems: The Old and The New
    • Yahoo! Research, NYC. February 25, 2014.
    • MIT, Boston, MA. Algorithms and Complexity Seminar. February 19, 2014.
    • Toyota Technological Institute, Chicago, IL. February 17, 2014.
    • Brown University, Providence, RI. ICERM Theory Seminar. January 31, 2014.
  • Lp-Testing [Slides: (pptm), (pdf))]
    • Johns Hopkins University, Sublinear Algorithms Workshop. January 08, 2016.
    • Columbia University, Theory seminar. October 24, 2014.
    • 46th ACM Symposium on Theory of Computing (STOC 2014). June 01, 2014.
    • Microsoft Research, Redmond, Theory lunch. January 08, 2014.
    • Harvard University, Theory seminar. November 12, 2013.
    • Brown University, Providence, RI. Theory seminar. November 1, 2013.
    • IBM Almaden Research Center, San Jose, CA. Theory seminar. October 25, 2013.
  • Property Testing and Communication Complexity [Slides: (pptx), (pdf)]
    • MIT, Boston, MA. Algorithms and Complexity Seminar. September 11, 2013.
  • Accurate and Efficient Private Release of Datacubes and Contingency Tables
    • Cornell University, CDI project meeting. May 07, 2013. [Slides: (pptx), (pdf)]
    • 29th IEEE Inernational Conference on Data Engineering (ICDE 2013). April 10, 2013. [Slides: (pdf)]
  • Beating the Direct Sum Theorem in Communication Compelxity
    • Aarhus University, Denmark. Theory seminar. May 22, 2013.
    • MIT, Boston, MA. Algorithms and Complexity seminar. December 13, 2012.
    • Princeton University, Princeton, NJ. Theory lunch. November 16, 2012.
  • Overlapping Clustering with Qualitative Information
    • 53rd IEEE Symposium on Foundations of Computer Science (FOCS 2012). Poster session. October 22, 2012.
  • Parallel Algorithms for Geometric Problems [Slides: (pptx), (pdf)]
    • 22nd International Symposium on Mathematical Programming (ISMP 2015). July 15, 2015.
    • Johns Hopkins University, Baltimore, MD. Algoritms and Complexity Seminar. November 19, 2014.
    • University of Maryland, College Park, MD. Capital Area Theory Seminar. October 30, 2014.
    • University of Pennsylvania, Philadelphia, PA. Theory Seminar. August 29, 2014.
    • University of Massachusetts, Amherst, MA. Theory Seminar. May 19, 2014.
    • Google Research, NYC. Google Tech Talk. April 04, 2014.
    • Sandia Labs, Livermore, CA. March 27, 2014.
    • Stanford University, Stanford, CA. March 25, 2014.
    • Microsoft Research, SVC, Mountain View, CA. Lab meeting. October 17, 2012.
  • Learning and Testing Submodular Functions [ Video from MSR (link)][Slides: (pptx), (pdf)]
    • Microsoft Research, Redmond. Theory seminar. June 11, 2013.
    • University of Melbourne, Australia. Theory seminar. April 19, 2013.
    • UCLA, Los Angeles, CA. Theory seminar. February 04, 2013.
    • 24th ACM-SIAM Symposium on Discrete Algorithms (SODA 2013). January 08, 2013.
    • Weizmann Institute of Science, Rehovot, Israel. December 30, 2012.
    • Harvard University, Boston, MA. Theory of Computing seminar. December 10, 2012.
    • Carnegie Mellon University, Pittsburgh, PA. Theory Lunch. December 05, 2012.
    • Carnegie Mellon University, Pittsburgh, PA. Tepper School of Business, Operations Research Seminar. December 07, 2012.
    • New York Computer Science and Economics Day 2012, Poster session. December 3, 2012.
    • IBM T.J. Watson Research Cetner, Yorktown Heights, NY. IP for Lunch. November 14, 2012.
    • Columbia University, NYC. Theory seminar. October 26, 2012.
    • 53rd IEEE Symposium on Foundations of Computer Science (FOCS 2012). Poster session. October 22, 2012.
    • Microsoft Research, Silicon Valley. Theory seminar. October 10, 2012.
    • EPFL, Lausanne, Switzerland. Algorithmic Frontiers Workshop, poster session. June 2012.
    • IBM Almaden Research Center, San Jose, CA. Theory seminar. May 2012.
    • 44th ACM Symposium on the Theory of Computing (STOC 2012). Poster session. May 2012.
  • Primal-dual Algorithms for Node-Weighted Network Design in Planar Graphs [Slides: (pptx), (pdf)]
    • 15th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems (APPROX 2012). August 2012.
  • Advances in Directed Spanners [Slides: (pdf)].
    • University of Sydney, Australia. Theory seminar. April 9, 2013.
    • Carnegie Mellon University, Theory Lunch, November 2011.
    • University of Maryland, Capital Area Theory Seminar, November 2011.
  • Private Analysis of Graph Structure [Slides: (pptx), (pdf)]
    • EPFL, Lausanne, Switzerland. Algorithmic Frontiers Workshop, poster session. June 2012. [Poster: (pdf)]
    • AT&T Labs --- Research, Florham Park, NJ. August 2011.
    • 37th International Conference on Very Large Data Bases (VLDB 2011), Research track. August 2011.
  • Improved Approximation for the Directed Spanner Problem [Slides: (pptx), (pdf)]
    • 38th International Colloquium on Automata, Languages and Programming (ICALP 2011), Track A. July 2011. [Slides: (pptx)]
    • AT&T Labs --- Research, Florham Park, NJ. Mathematics Research Colloquium and Informal Seminar. June 2011.
    • 43rd ACM Symposium on the Theory of Computing (STOC 2011). Poster session. June 2011. [Poster: (pdf)]
    • Moscow State University. Combinatorial optimization seminar. May 2011.
    • IBM T.J. Watson Research Center, Yorktown Heights, NY. IP for lunch. April 2011.
    • St. Petersburg Institute of Fine Mechanics and Optics. Theory seminar. December 2010.
  • Steiner Transitive-Closure Spanners of Low-Dimensional Posets [Slides: (ppsx), (pdf)]
    • 38th International Colloquium on Automata, Languages and Programming (ICALP 2011),Track A. July 2011.
  • Linear Bounds on Circuit Complexity and Feebly One-Way Permutations [Slides: (pdf)]
    • Pennsylvania State University. Theory seminar. April 2010.

Teaching

Personal

  • Algorithm competitions

    I participated in ACM ICPC and TopCoder competitions (as griffon) competing and setting problems in TopCoder Open Algorithms Finals.

  • Teaching algorithms for high-school students

    I was teaching advanced classes in algorithms for high-school students for ~5 years, coaching teams for algorithmic competitions. I participated in preparation of training camps and contests for Russian Olympiad in Informatics and International Olympiad in Informatics (both in Russia and in the U.S.).

  • Non-profit in education for high-school students: Homepage, Group on Vkontakte

    In 2009 I co-founded a non-profit organization focused on advanced extracurricular education in algorithms for high-school students (5 – 10 grades) in St. Petersburg, Russia. In 2013 it was expanded to Moscow and Yekaterinburg.

    In the early days we were supported by , and . Now donations are welcome!

  • Triathlon and Duathlon

    If I am not working then I am probably practicing for the next (more pictures). Let's do it together! ;)

    In 2017 I was representing Team USA in the M30-34 age group at the ITU Standard Distance Duathlon World Championship in Penticton, Canada. In 2019 I was representing Team USA in the M30-34 age group at the ITU Long Distance Triathlon World Championship in Pontevedra, Spain.

  • Software Engineering

    In 2007–2008 I was lucky to be a part of the FBReader team. This is an open source free e-Book Reader project.

    We developed the first version of FBReader for Android (now > 10M downloads worldwide on Google play).

  • Very nerdy!

    There are some things I can't prove but rather just believe in. E.g. this logo I designed and proposed for the CSTheory website.

  • Education

    St. Petersburg Academic University is a unique center for continuous education in physics and engineering, run by Zhores Alferov, a Nobel Prize winner in Physics. In 8 years there I finished high school, B.S. and M.S. (a pilot class in theoretical computer science where I was the first student). I am forever grateful to all my teachers during those happy years!

    Here is a recent video (in Russian) about the new bachelors programs at the Academic University.

Links


Graduate School and Academia

Diversity Statement

  • As a member of SIAM I support and follow its statement on inclusiveness.
  • I strongly support cultural diversity and my co-authors are from 16 countries: Belarus, Brazil, Bulgaria, Canada, China, Greece, India, Iran, Israel, Moldova, Poland, Romania, Russia, Serbia, United Kingdom and United States.