Publications
Google Scholar lists papers in citation count order.
N. Parikh and A. Hohman. NYC Artificial Intelligence Strategy. City of New York, 2021.
The NYC AI Strategy is a foundational effort to foster a healthy cross-sector AI ecosystem in New York City. The document establishes a baseline of information about AI to help ensure decision-makers are working from an accurate and shared understanding of the technology and the issues it presents, outlines key components and characteristics of the local AI ecosystem today, and frames a set of areas of opportunity for City action.
N. Parikh and A. Hohman. NYC Artificial Intelligence Primer. City of New York, 2021.
Establishing a clear understanding of what AI is, how it works, and what some of the key practical and ethical considerations are around its use is foundational to building an effective AI strategy. The City of New York’s AI Primer aims to help provide this foundation, primarily for an audience of technical, policy, business, or other decision-makers. This document is intended to provide a reasonable working knowledge and overview of AI and machine learning for a broad but reasonably senior audience, as opposed to the general public. This document is not specific to NYC and is also well-suited for use in other governments or in corporate or academic contexts.
N. Parikh, G. Fahs, and B. Nonnecke. Test-Driven Development for Technology Policymaking. The Aspen Institute, 2019.
Technology policy frequently does not work as intended. Often, policies in this space do not fully address the issues they are concerned with or have negative unintended consequences. Developing software was also once as ad-hoc and error-prone until robust methods for testing and writing software were developed. This project recommends applying one such framework, test-driven development (TDD), to policymaking to help develop more robust policies that are more likely to accurately address their issue of concern. The method does not require the use of engineering tools or software; it is simply a methodology that focuses on brainstorming concrete situations in which to ‘test’ possible policy solutions.
B. O'Donoghue, E. Chu, N. Parikh, and S. Boyd. SCS (Splitting Conic Solver). GitHub, 2019.
SCS, the splitting conic solver, is a numerical optimization package for solving large-scale convex cone problems, based on our paper Conic Optimization via Operator Splitting and Homogeneous Self-Dual Embedding. It is written in C and can be used in other C, C++, Python, Matlab, R, Julia, and Ruby programs via linked interfaces. It can also be called as a solver from convex optimization toolboxes CVX (3.0 or later), CVXPY, Convex.jl, and Yalmip.
B. O'Donoghue, E. Chu, N. Parikh, and S. Boyd. Conic Optimization via Operator Splitting and Homogeneous Self-Dual Embedding. Journal of Optimization Theory and Applications, 2016.
We introduce a first order method for solving very large cone programs to modest accuracy. The method uses an operator splitting method, the alternating direction method of multipliers, to solve the homogeneous self-dual embedding, an equivalent feasibility problem involving finding a nonzero point in the intersection of a subspace and a cone.
N. Parikh and S. Boyd. Proximal Algorithms. Foundations and Trends in Optimization, 2014.
This monograph is about a class of optimization algorithms called proximal algorithms. Much like Newton’s method is a standard tool for solving unconstrained smooth optimization problems of modest size, proximal algorithms can be viewed as an analogous tool for nonsmooth, constrained, large-scale, or distributed versions of these problems. They are very generally applicable, but are especially well-suited to problems of substantial recent interest involving large or high-dimensional datasets. Proximal methods sit at a higher level of abstraction than classical algorithms like Newton’s method: the base operation is evaluating the proximal operator of a function, which itself involves solving a small convex optimization problem. These subproblems, which generalize the problem of projecting a point into a convex set, often admit closed-form solutions or can be solved very quickly with standard or simple specialized methods. Here, we discuss the many different interpretations of proximal operators and algorithms, describe their connections to many other topics in optimization and applied mathematics, survey some popular algorithms, and provide a large number of examples of proximal operators that commonly arise in practice.
N. Parikh and S. Boyd. Block Splitting for Distributed Optimization. Mathematical Programming Computation, 2014.
This paper describes a general purpose method for solving convex optimization problems in a distributed computing environment. In particular, if the problem data includes a large linear operator or matrix \(A\), the method allows for handling each subblock of \(A\) on a separate machine. The approach works as follows. First, we define a canonical problem form called graph form, in which we have two sets of variables \(x\) and \(y\) related by a linear operator \(A\), such that the objective function is separable across these two sets of variables. Many problems are easily expressed in graph form, including cone programs and a wide variety of regularized loss minimization problems from statistics, like logistic regression, the support vector machine, and the lasso. Next, we describe graph projection splitting, a form of Douglas-Rachford splitting or the alternating direction method of multipliers, to solve graph form problems serially. Finally, we derive a distributed block splitting algorithm based on graph projection splitting. In a statistical or machine learning context, this allows for training models exactly with a huge number of both training examples and features, such that each processor handles only a subset of both. To the best of our knowledge, this is the only general purpose method with this property. We present several numerical experiments in both the serial and distributed settings.
N. Parikh. Distributed Convex Optimization with Proximal Methods. Stanford University, 2014.
This thesis included work from the monographs “Proximal Algorithms” and “Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers” along with some additional material from “Block Splitting for Distributed Optimization”.
E. Chu, N. Parikh, A. Domahidi, and S. Boyd. Code Generation for Embedded Second-Order Cone Programming. Proceedings of the European Control Conference, 2013.
This paper describes a framework for generating easily verifiable code to solve convex optimization problems in embedded applications by transforming them into equivalent second-order cone programs. In embedded applications, it is critical to be able to verify code correctness, but it is also desirable to be able to rapidly prototype and deploy high-performance solvers for different problems. To balance these two requirements, we propose a code generation system that takes high-level descriptions of convex optimization problems and generates code that maps the parameters in the original problem to data in an equivalent second-order cone program, which is then solved by a single, external solver that can be verified once and for all. A novel aspect is that we restrict the parameters in the original problem to only appear in affine functions, which lets us map the parameters to problem data without performing any floating point operations. As a result, the generated code is lightweight, fast, and trivial to verify. The approach thus marries the benefits of high-level parser/solvers with custom, high-performance, high-reliability solvers for embedded applications.
E. Chu, B. O'Donoghue, N. Parikh, and S. Boyd. A Primal-Dual Operator Splitting Method for Conic Optimization. Unfinished draft, 2013.
We develop a simple operator splitting method for solving a primal conic optimization problem; we show that the iterates also solve the dual problem. The resulting algorithm is very simple to describe and implement and yields solutions of modest accuracy in competitive times. Several versions of the algorithm are amenable to parallelization, either via distributed linear algebra or GPU-accelerated matrix-vector multiplication. We provide a simple, single-threaded C implementation for reference.
N. Parikh and S. Boyd. Block Splitting for Large-Scale Distributed Learning. Neural Information Processing Systems, Workshop on Big Learning, 2011.
Machine learning and statistics with very large datasets is now a topic of widespread interest, both in academia and industry. Many such tasks can be posed as convex optimization problems, so algorithms for distributed convex optimization serve as a powerful, general-purpose mechanism for training a wide class of models on datasets too large to process on a single machine. In previous work, it has been shown how to solve such problems in such a way that each machine only looks at either a subset of training examples or a subset of features. In this paper, we extend these algorithms by showing how to split problems by both examples and features simultaneously, which is necessary to deal with datasets that are very large in both dimensions. We present some experiments with these algorithms run on Amazon’s Elastic Compute Cloud.
S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. Foundations and Trends in Machine Learning, 2011.
Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features, training examples, or both. As a result, both the decentralized collection or storage of these datasets as well as accompanying distributed solution methods are either necessary or at least highly desirable. In this paper, we argue that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas-Rachford splitting, Spingarn’s method of partial inverses, Dykstra’s alternating projections, Bregman iterative algorithms for \(\ell_1\) problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, we discuss applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. We also discuss general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop MapReduce implementations.