Home Page
- Mahdi Cheraghchi, Johan Håstad, Marcus Isaksson, Ola Svensson.
Approximating Linear Threshold Predicates. In Proceedings of the 13th International
Workshop on Approximation Algorithms for Combinatorial Optimization
Problems (APPROX), 2010 (to appear).
We study constraint
satisfaction problems with constraints defined by homogeneous linear
threshold predicates. Specifically, we consider the standard
optimization problem Max-CSP(P)
where the objective is to satisfy as many constraints of the same type P on a number of Boolean variables
as possible. We assume that the constraints are defined by a fixed
linear threshold predicate P:
{-1,+1}n → {-1,+1} of the form P(x1,
..., xn) = sgn(w1 x1 + ... + wn xn), for
positive integer weights wi (i=1, ..., n). Our focus is
on a range of threshold predicates for which the problem does not
become approximation resistant, and we study the approximation curve of
this class of problems. For the special case of the majority function
with w1 = ... = wn = 1
and n odd, we obtain
almost-matching approximability and hardness results that can be
summarized as follows:
- Approximation: Using
linear programming, we design a polynomial-time algorithm that, given a
1-δ/(n+1) satisfiable instance for any δ
< 1, satisfies a 1/2 + (1-δ)7
Ω(1/sqrt(n)) fraction of the constraints.
- Hardness: Assuming
the Unique Games Conjecture (Khot, STOC 2002), for any ε>0 and δ
in [0,1], given a (1-δ/(n+1)
- ε)-satisfiable instance it is NP-hard
to satisfy a 1/2 + (1-δ) Ω(1/sqrt(n)
+ ε fraction of the constraints.
We extend the above results to a more general class of
"majority-like" predicates and obtain parallel results for them.
Loosely speaking, this class of predicates can be defined using weights
wi that are
generally small. Towards this, we introduce the notion of Chow-robustness that might be of
independent interest.
The basic goal in combinatorial
group testing is to identify a set of up to d defective items within a large
population of size n >>
d using a pooling strategy.
Namely, the items can be grouped together in pools, and a single
measurement would reveal whether there are one or more defectives in
the pool. The threshold model is a generalization of this idea where a
measurement returns positive if the number of defectives in the pool
passes a fixed threshold u,
negative if this number is below a fixed lower threshold L ≤
u , and may behave arbitrarily
otherwise. We study non-adaptive
threshold group testing (in a possibly noisy setting) and show that,
for this problem, O(dg+2 (log d) log(n/d))
measurements
(where
g := u - L) suffice to identify the
defectives, and also present
almost matching lower bounds. This significantly improves the
previously known (non-constructive) upper bound O(du+1 log(n/d)).
Moreover,
we
obtain a framework for explicit construction of
measurement schemes using lossless condensers. The number of
measurements resulting from this scheme is ideally bounded by O(dg+3 (log d) log n). Using state-of-the-art
constructions of
lossless condensers, however, we come up with explicit testing schemes
with O(dg+3 (log d) quasipoly(log n)) and O(dg+3+β poly(log n))
measurements, for arbitrary constant β >
0.
- Mahdi Cheraghchi, Amin Karbasi, Soheil Mohajer, Vankatesh
Saligrama. Graph-Constrained
Group Testing. In Proceedings of
IEEE International
Symposium on Information Theory (ISIT), 2010.
Non-adaptive group testing
involves grouping arbitrary subsets of n items into different pools. Each
pool is then tested and defective items are identified. A fundamental
question involves minimizing the number of pools required to identify
at most d defective items.
Motivated by applications in network tomography, sensor networks and
infection propagation we formulate group testing problems on graphs.
Unlike conventional group testing problems each group here must conform
to the constraints imposed by a graph. For instance, items can be
associated with vertices and each pool is any set of nodes that must be
path connected. In this paper we associate a test with a random walk.
In this context conventional group testing corresponds to the special
case of a complete graph on n
vertices.
For interesting classes of graphs we arrive at a rather surprising
result, namely, that the number of tests required to identify d defective items is substantially
similar to that required in conventional group testing problems, where
no such constraints on pooling is imposed. Specifically, if T(n)
corresponds
to
the mixing time of the graph G,
we
show that with m=O(d2 T2(n) log(n/d))
non-adaptive
tests,
one can identify the defective items. Consequently,
for the Erdos-Renyi random graph G(n,p),
as
well
as expander graphs with constant spectral gap, it follows that m=O(d2 log3(n)) non-adaptive
tests are sufficient to identify d
defective items. We next consider a specific scenario that arises in
network tomography and show that m=O(d3 log3(n)) non-adaptive
tests are sufficient to identify d
defective items. We also consider noisy counterparts of the graph
constrained group testing problem and develop parallel results for
these cases.
Detection of defective members
of large populations has been widely studied in the statistics
community under the name "group testing", a problem which dates back to
World War II when it was suggested for syphilis screening. There the
main interest is to identify a small number of infected people among a
large population using collective samples. In viral epidemics, one way
to acquire collective samples is by sending agents inside the
population. While in classical group testing, it is assumed that the
sampling procedure is fully known to the reconstruction algorithm, in
this work we assume that the decoder possesses only partial knowledge
about the sampling process. This assumption is justified by
observing the fact that in a viral sickness, there is a chance that an
agent remains healthy despite having contact with an infected person.
Therefore, the reconstruction method has to cope with two different
types of uncertainty; namely, identification of the infected population
and the partially unknown sampling procedure.
In this work, by using a
natural probabilistic model for "viral infections", we design
non-adaptive sampling procedures that allow successful identification
of the infected population with overwhelming probability 1-o(1). We
propose both probabilistic and explicit design procedures that require
a "small" number of agents to single out the infected individuals. More
precisely, for a contamination probability p, the number of agents required by
the probabilistic and explicit designs for identification of up to k infected members is bounded by m = O(k2 (log n)
/ p2) and m = O(k2 (log2n) / p2),
respectively. In both cases, a simple decoder is able to
successfully identify the infected population in time O(mn).
We study combinatorial group
testing schemes for learning d-sparse
boolean
vectors
using highly unreliable disjunctive measurements. We
consider an adversarial noise model that only limits the number of
false observations, and show that any noise-resilient scheme in this
model can only approximately reconstruct the sparse vector. On
the positive side, we give a general framework for construction of
highly noise-resilient group testing schemes using randomness
condensers. Simple randomized instantiations of this construction give
non-adaptive measurement schemes, with m=O(d log n) measurements, that allow
efficient reconstruction of d-sparse
vectors
up
to O(d) false
positives even in the presence of delta*m
false positives and Omega(m/d) false negatives within the
measurement outcomes, for any
constant delta < 1. None of these parameters can be
substantially improved without dramatically affecting the others.
Furthermore, we obtain several explicit (and incomparable)
constructions, in particular one matching the randomized trade-off but
using m = O(d1+o(1) log n) measurements. We also obtain
explicit constructions that allow fast reconstruction in time poly(m), which would be sublinear in n for sufficiently sparse vectors.
We give a general framework for
construction of small ensembles of capacity achieving linear codes for
a wide range of (not necessarily memoryless) discrete symmetric
channels, and in particular, the binary erasure and symmetric
channels. The main tool used in our constructions is the notion
of randomness extractors and lossless condensers that are regarded as
central tools in theoretical computer science. Same as random
codes, the resulting ensembles preserve their capacity achieving
properties under any change of basis. Our methods can potentially
lead to polynomial-sized ensembles; however, using known explicit
constructions of randomness conductors we obtain specific ensembles
whose size is as small as quasipolynomial in the block length. By
applying our construction to Justesen's concatenation scheme (Justesen,
1972) we obtain explicit capacity achieving codes for BEC (resp., BSC)
with almost linear time encoding and almost linear time (resp.,
quadratic time) decoding and exponentially small error
probability. The explicit code for BEC is defined and capacity
achieving for every block length.
A wiretap protocol is a pair of
randomized encoding and decoding functions such that knowledge of a
bounded fraction of the encoding of a message reveals essentially no
information about the message, while knowledge of the entire encoding
reveals the message using the decoder. In this paper we study the
notion of efficiently invertible extractors and show that a wiretap
protocol can be constructed from such an extractor. We will then
construct invertible extractors for symbol-fixing, affine, and general
sources and apply them to create wiretap protocols with asymptotically
optimal trade-offs between their rate (ratio of the length of the
message versus its encoding) and resilience (ratio of the observed
positions of the encoding and the length of the encoding). We
will then apply our results to create wiretap protocols for challenging
communication problems, such as active intruders who change portions of
the encoding, network coding, and intruders observing arbitrary boolean
functions of the encoding.
This paper studies the
stability of some reconstruction algorithms for compressed
sensing in terms of the bit precision.
Considering
the
fact that practical digital systems deal with
discretized signals, we motivate the importance of the total number of
accurate bits needed from the measurement outcomes in addition to the
number of measurements. It is shown that if one uses a 2k*n
Vandermonde
matrix
with roots on the unit circle as the measurement
matrix, O(L + k log(n/k))
bits
of
precision per measurement are sufficient to reconstruct a k-sparse signal x \in Rn with dynamic range (i.e., the absolute
ratio between the largest and the smallest nonzero coefficients) at
most 2L within L
bits of precision, hence identifying its correct support. Finally, we
obtain an upper bound on the total number of required bits when the
measurement matrix satisfies a restricted isometry property, which is
in particular the case for random Fourier and Gaussian matrices.
For very sparse signals, the upper bound on the number of required bits
for Vandermonde matrices is shown to be better than this general upper
bound.
We consider the problem of
uniform sampling of points on an algebraic variety. Specifically,
we develop a randomized algorithm that, given a small set of
multivariate polynomials over a sufficiently large finite field,
produces a common zero of the polynomials almost uniformly at random.
The statistical distance between the output distribution of the
algorithm and the uniform distribution on the set of common zeros is
polynomially small in the field size, and the running time of the
algorithm is polynomial in the description of the polynomials and their
degrees provided that the number of the polynomials is a constant.
- Mahdi Cheraghchi, Amin Shokrollahi, Avi Wigderson. Computational Hardness
and Explicit Constructions of Error Correcting Codes. In Proceedings of 44th Allerton Conference on
Communication, Control and Computing (invited paper), 2006.
We outline a procedure for
using pseudorandom generators to construct binary codes with good
properties, assuming the existence of sufficiently hard functions.
Specifically, we give a polynomial time algorithm, which for every
integers n and k, constructs polynomially many
linear codes of block length n
and dimension k, most of
which achieving the Gilbert-Varshamov bound. The success of the
procedure relies on the assumption that the exponential time class of E := DTIME[2O(n)] is not contained in the
sub-exponential space class DSPACE[2o(n)]. The methods used
in this paper are by now standard within computational complexity
theory, and the main contribution of this note is observing that they
are relevant to the construction of optimal codes. We attempt to make
this note self contained, and describe the relevant results and proofs
from the theory of pseudorandomness in some detail.
- Mahdi Cheraghchi. On
Matrix
Rigidity
and the Complexity of Linear Forms. ECCC Technical Report TR05-070. (talk)
The
rigidity function of a matrix is defined as the minimum number of
its entries that need to be changed in order to reduce the rank of the
matrix to below a given parameter. Proving a strong enough lower bound
on the rigidity of a matrix implies a nontrivial lower bound on the
complexity of any linear circuit computing the set of linear forms
associated with it. However, although it is shown that most matrices
are rigid enough, no explicit construction of a rigid family of
matrices is known.
In this survey report we review the concept of rigidity and some of its
interesting variations as well as several notable results related to
that. We also show the existence of highly rigid matrices constructed
by evaluation of bivariate polynomials over finite fields.
Error
correcting codes are combinatorial objects that allow reliable
recovery of information in presence of errors by cleverly augmenting
the original information with a certain amount of redundancy.
The availability of efficient means of error detection is
considered as a fundamental criterion for error correcting codes.
Locally testable codes are families of error correcting codes that are
highly robust against transmission errors and in addition provide
super-efficient (sublinear time) probabilistic algorithms for error
detection. In particular, the error detection algorithm probes the
received sequence only at a small (or even constant) number of
locations.
There seems to be a trade-off between the amount of redundancy
and
the number of probes for the error detection procedure in locally
testable codes. Even though currently best constructions allow
reduction of redundancy to a nearly linear amount, it is not clear
whether this can be further reduced to linear while preserving a
constant number of probes.
We study the formal notion of locally testable codes and survey
several major results in this area. We also investigate closely related
concepts, and in particular, polynomial low-degree tests and
probabilistically checkable proofs.
Last Updated on June 2010
Home Page