Home Page

This page contains links to the latest (and most complete) revision of each work.


We show that all non-negative submodular functions have high noise-stability. As a consequence, we obtain a polynomial-time learning algorithm for this class with respect to any product distribution on {-1,1}n (for any constant accuracy parameter ε).  Our algorithm also succeeds in the agnostic setting.  Previous work on learning submodular functions required either query access or strong assumptions about the types of submodular functions to be learned (and did not hold in the agnostic setting).   Additionally we give simple algorithms that efficiently release differentially private answers to all Boolean conjunctions and to all halfspaces with constant average error, subsuming and improving recent work due to Gupta, Hardt, Roth and Ullman (STOC 2011).
We review connections between coding-theoretic objects and sparse learning problems. In particular, we show how seemingly different combinatorial objects such as error-correcting codes, combinatorial designs, spherical codes, compressed sensing matrices and group testing designs can be obtained from one another. The reductions enable one to translate upper and lower bounds on the parameters attainable by one object to another. We survey some of the well-known reductions in a unified presentation, and bring some existing gaps to attention. New reductions are also introduced; in particular, we bring up the notion of minimum "L-wise distance" of codes and show that this notion closely captures the combinatorial structure of RIP-2 matrices. Moreover, we show how this weaker variation of the minimum distance is related to combinatorial list-decoding properties of codes.
The rapid development of derandomization theory, which is a fundamental area in theoretical computer science, has recently led to many surprising applications outside its initial intention. We will review some recent such developments related to combinatorial group testing. In its most basic setting, the aim of group testing is to identify a set of "positive" individuals in a population of items by taking groups of items and asking whether there is a positive in each group.
In particular, we will discuss explicit constructions of optimal or nearly-optimal group testing schemes using "randomness-conducting" functions. Among such developments are constructions of error-correcting group testing schemes using randomness extractors and condensers, as well as threshold group testing schemes from lossless condensers.
We study constraint satisfaction problems with constraints defined by homogeneous linear threshold predicates. Specifically, we consider the standard optimization problem Max-CSP(P) where the objective is to satisfy as many constraints of the same type P on a number of Boolean variables as possible. We assume that the constraints are defined by a fixed linear threshold predicate P: {-1,+1}n → {-1,+1} of the form P(x1, ..., xn) = sgn(w1 x1 + ... + wn xn), for positive integer weights wi (i=1, ..., n). Our focus is on a range of threshold predicates for which the problem does not become approximation resistant, and we study the approximation curve of this class of problems. For the special case of the majority function with w1 = ... = wn = 1 and n odd, we obtain almost-matching approximability and hardness results that can be summarized as follows:
We extend the above results to a more general class of "majority-like" predicates and obtain parallel results for them. Loosely speaking, this class of predicates can be defined using weights wi that are generally small. Towards this, we introduce the notion of Chow-robustness that might be of independent interest.
The basic goal in combinatorial group testing is to identify a set of up to d defective items within a large population of size n >> d using a pooling strategy. Namely, the items can be grouped together in pools, and a single measurement would reveal whether there are one or more defectives in the pool. The threshold model is a generalization of this idea where a measurement returns positive if the number of defectives in the pool passes a fixed threshold u, negative if this number is below a fixed lower threshold Lu , and may behave arbitrarily otherwise. We study non-adaptive threshold group testing (in a possibly noisy setting) and show that, for this problem, O(dg+2 (log d) log(n/d)) measurements (where g := u - L) suffice to identify the defectives, and also present almost matching lower bounds. This significantly improves the previously known (non-constructive) upper bound O(du+1 log(n/d)). Moreover, we obtain a framework for explicit construction of measurement schemes using lossless condensers. The number of measurements resulting from this scheme is ideally bounded by O(dg+3 (log d) log n). Using state-of-the-art constructions of lossless condensers, however, we come up with explicit testing schemes with O(dg+3 (log d) quasipoly(log n)) and O(dg+3+β poly(log n)) measurements, for arbitrary constant β > 0. 
Non-adaptive group testing involves grouping arbitrary subsets of n items into different pools. Each pool is then tested and defective items are identified. A fundamental question involves minimizing the number of pools required to identify at most d defective items. Motivated by applications in network tomography, sensor networks and infection propagation we formulate group testing problems on graphs. Unlike conventional group testing problems each group here must conform to the constraints imposed by a graph. For instance, items can be associated with vertices and each pool is any set of nodes that must be path connected. In this paper we associate a test with a random walk. In this context conventional group testing corresponds to the special case of a complete graph on n vertices.
For interesting classes of graphs we arrive at a rather surprising result, namely, that the number of tests required to identify d defective items is substantially similar to that required in conventional group testing problems, where no such constraints on pooling is imposed. Specifically, if T(n) corresponds to the mixing time of the graph G, we show that with m=O(d2 T2(n) log(n/d)) non-adaptive tests, one can identify the defective items. Consequently, for the Erdos-Renyi random graph G(n,p), as well as expander graphs with constant spectral gap, it follows that
m=O(d2 log3(n)) non-adaptive tests are sufficient to identify d defective items. We next consider a specific scenario that arises in network tomography and show that m=O(d3 log3(n)) non-adaptive tests are sufficient to identify d defective items. We also consider noisy counterparts of the graph constrained group testing problem and develop parallel results for these cases.
Detection of defective members of large populations has been widely studied in the statistics community under the name "group testing", a problem which dates back to World War II when it was suggested for syphilis screening. There the main interest is to identify a small number of infected people among a large population using collective samples. In viral epidemics, one way to acquire collective samples is by sending agents inside the population. While in classical group testing, it is assumed that the sampling procedure is fully known to the reconstruction algorithm, in this work we assume that the decoder possesses only partial knowledge about the sampling process. This assumption is justified  by observing the fact that in a viral sickness, there is a chance that an agent remains healthy despite having contact with an infected person. Therefore, the reconstruction method has to cope with two different types of uncertainty; namely, identification of the infected population and the partially unknown sampling procedure.
In this work, by using a natural probabilistic model for "viral infections", we design non-adaptive sampling procedures that allow successful identification of the infected population with overwhelming probability 1-o(1). We propose both probabilistic and explicit design procedures that require a "small" number of agents to single out the infected individuals. More precisely, for a contamination probability p, the number of agents required by the probabilistic and explicit designs for identification of up to k infected members is bounded by m = O(k2 (log n) / p3) and m = O(k2 (log2n) / p3), respectively.  In both cases, a simple decoder is able to successfully identify the infected population in time O(mn).
We study combinatorial group testing schemes for learning d-sparse boolean vectors using highly unreliable disjunctive measurements. We consider an adversarial noise model that only limits the number of false observations, and show that any noise-resilient scheme in this model can only approximately reconstruct the sparse vector.  On the positive side, we give a general framework for construction of highly noise-resilient group testing schemes using randomness condensers. Simple randomized instantiations of this construction give non-adaptive measurement schemes, with m=O(d log n) measurements,  that allow efficient reconstruction of d-sparse vectors up to O(d) false positives even in the presence of delta*m false positives and Omega(m/d) false negatives within the measurement outcomes, for any constant delta < 1.  None of these parameters can be substantially improved without dramatically affecting the others. Furthermore, we obtain several explicit (and incomparable) constructions, in particular one matching the randomized trade-off but using m = O(d1+o(1) log n) measurements. We also obtain explicit constructions that allow fast reconstruction in time poly(m), which would be sublinear in n for sufficiently sparse vectors.
We give a general framework for construction of small ensembles of capacity achieving linear codes for a wide range of (not necessarily memoryless) discrete symmetric channels, and in particular, the binary erasure and symmetric channels.  The main tool used in our constructions is the notion of randomness extractors and lossless condensers that are regarded as central tools in theoretical computer science.  Same as random codes, the resulting ensembles preserve their capacity achieving properties under any change of basis.  Our methods can potentially lead to polynomial-sized ensembles; however, using known explicit constructions of randomness conductors we obtain specific ensembles whose size is as small as quasipolynomial in the block length.  By applying our construction to Justesen's concatenation scheme (Justesen, 1972) we obtain explicit capacity achieving codes for BEC (resp., BSC) with almost linear time encoding and almost linear time (resp., quadratic time)  decoding and exponentially small error probability. The explicit code for BEC is defined and capacity achieving for every block length.
A wiretap protocol is a pair of randomized encoding and decoding functions such that knowledge of a bounded fraction of the encoding of a message reveals essentially no information about the message, while knowledge of the entire encoding reveals the message using the decoder.  In this paper we study the notion of efficiently invertible extractors and show that a wiretap protocol can be constructed from such an extractor.  We will then construct invertible extractors for symbol-fixing, affine, and general sources and apply them to create wiretap protocols with asymptotically optimal trade-offs between their rate (ratio of the length of the message versus its encoding) and resilience (ratio of the observed positions of the encoding and the length of the encoding).  We will then apply our results to create wiretap protocols for challenging communication problems, such as active intruders who change portions of the encoding, network coding, and intruders observing arbitrary boolean functions of the encoding.

This paper studies the stability of some reconstruction algorithms  for compressed sensing in terms of the bit precision. Considering the fact that practical digital systems  deal with discretized signals, we motivate the importance of the total number of accurate bits needed from the measurement outcomes in addition to the number of measurements.  It is shown that if one uses a 2k*n  Vandermonde matrix with roots on the unit circle as the measurement matrix, O(L + k log(n/k)) bits of precision per measurement are sufficient to reconstruct a k-sparse signal x \in Rn with dynamic range (i.e., the absolute ratio between the largest and the smallest nonzero coefficients) at most 2L within L bits of precision, hence identifying its correct support. Finally, we obtain an upper bound on the total number of required bits when the measurement matrix satisfies a restricted isometry property, which is in particular the case for random Fourier and Gaussian matrices.  For very sparse signals, the upper bound on the number of required bits for Vandermonde matrices is shown to be better than this general upper bound.
We consider the problem of uniform sampling of points on an algebraic variety.  Specifically, we develop a randomized algorithm that, given a small set of multivariate polynomials over a sufficiently large finite field, produces a common zero of the polynomials almost uniformly at random. The statistical distance between the output distribution of the algorithm and the uniform distribution on the set of common zeros is polynomially small in the field size, and the running time of the algorithm is polynomial in the description of the polynomials and their degrees provided that the number of the polynomials is a constant.
The rigidity function of a matrix is defined as the minimum number of its entries that need to be changed in order to reduce the rank of the matrix to below a given parameter. Proving a strong enough lower bound on the rigidity of a matrix implies a nontrivial lower bound on the complexity of any linear circuit computing the set of linear forms associated with it. However, although it is shown that most matrices are rigid enough, no explicit construction of a rigid family of matrices is known. In this survey report we review the concept of rigidity and some of its interesting variations as well as several notable results related to that. We also show the existence of highly rigid matrices constructed by evaluation of bivariate polynomials over finite fields.
Randomized techniques play a fundamental role in theoretical computer science and discrete mathematics, in particular for the design of efficient algorithms and construction of combinatorial objects. The basic goal in derandomization theory is to eliminate or reduce the need for randomness in such randomized constructions.  Towards this goal, numerous fundamental notions have been developed to provide a unified framework for approaching various derandomization problems and to improve our general understanding of the power of randomness in computation.  Two important classes of such tools are pseudorandom generators and randomness  extractors. Pseudorandom generators transform a short, purely random, sequence into a much longer sequence that looks random, while extractors transform a weak source of randomness into a perfectly random one (or one with much better qualities, in which case the transformation is called a randomness condenser).

In this thesis, we explore some applications of the fundamental notions in derandomization theory to problems outside the core of theoretical computer science, and in particular, certain problems related to coding theory.  First, we consider the wiretap channel problem which involves a communication system in which an intruder can eavesdrop a limited portion of the transmissions. We utilize randomness extractors to construct efficient and information-theoretically optimal communication protocols for this model.

Then we consider the combinatorial group testing problem. In this classical problem, one aims to determine a set of defective items within a large population by asking a number of queries, where each query reveals whether a defective item is present within a specified group of items. We use randomness condensers to explicitly construct optimal, or nearly optimal, group testing schemes for a setting where the query outcomes can be highly unreliable, as well as the threshold model where a query returns positive if the number of defectives pass a certain threshold.

Next, we use randomness condensers and extractors to design ensembles of error-correcting codes that achieve the information-theoretic capacity of a large class of communication channels, and then use the obtained ensembles for construction of explicit capacity achieving codes. Finally, we consider the problem of explicit construction of error-correcting codes on the Gilbert-Varshamov bound and extend the original idea of Nisan and Wigderson to obtain a small ensemble of codes, mostly achieving the bound, under suitable computational hardness assumptions. 

Error correcting codes are combinatorial objects that allow reliable recovery of information in presence of errors by cleverly augmenting the original information with a certain amount of redundancy.
The availability of efficient means of error detection is considered as a fundamental criterion for error correcting codes. Locally testable codes are families of error correcting codes that are highly robust against transmission errors and in addition provide super-efficient (sublinear time) probabilistic algorithms for error detection. In particular, the error detection algorithm probes the received sequence only at a small (or even constant) number of locations.
There seems to be a trade-off between the amount of redundancy and the number of probes for the error detection procedure in locally testable codes. Even though currently best constructions allow reduction of redundancy to a nearly linear amount, it is not clear whether this can be further reduced to linear while preserving a constant number of probes.
We study the formal notion of locally testable codes and survey several major results in this area. We also investigate closely related concepts, and in particular, polynomial low-degree tests and probabilistically checkable proofs.
(implementation in ANSI C++, platform independent but needs OpenGL and glut)

Last Updated on October 2011

Home Page