Prof. Mario Pavone

Computazione Naturale

CdL Magistrale in Informatica

[email protected]
www.dmi.unict.it/mpavone/

Complexity Theory: the class NP

Mario Pavone, AA. 2010/2011 – Computazione Naturale – CdL Magistrale in Informatica – SDAI – DMI - UniCt

Bibliography

A description of the NP-complete problems can be found in:

M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman, 1st ed. (1979)

Cormen, Leiserson, Rivest and Stein, Introduzione agli Algoritmi e Strutture Dati, McGraw-Hill (chapter 34)

A compendium of NP optimization problems, at the link:

http://www.csc.kth.se/~viggo/problemlist/


Search Space

Given a combinatorial problem P, a search space associated with a mathematical formulation of P is defined by a pair (S, f), where S is a finite set of configurations (also called nodes or points) and f is a cost function that associates a real number with each configuration in S.

For this structure, the two most common quantities of interest are the minimum and the maximum cost. Looking for them gives the combinatorial optimization problems.


Example: K-SAT

An instance of the K-SAT problem consists of a set V of variables and a collection C of clauses over V such that each clause c ∈ C has |c| = K.

The problem is to find a satisfying truth assignment for C.

The search space for 2-SAT with |V| = 2 is (S, f), where S = { (T,T), (T,F), (F,T), (F,F) } and the cost function for 2-SAT simply counts the number of satisfied clauses:

f_sat(s) = #SatisfiedClauses(F, s), s ∈ S


Example: SEARCH SPACE of K-SAT

Consider F = (A ∨ ¬B) ∧ (¬A ∨ ¬B)

A  B  f_sat(F,s)
T  T  1
T  F  2
F  T  1
F  F  2
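The table can be reproduced mechanically. A minimal sketch, assuming the formula reads F = (A ∨ ¬B) ∧ (¬A ∨ ¬B), which is the reading consistent with the costs in the table. The clause encoding and the names f_sat and search_space are illustrative choices, not part of the slides.

```python
from itertools import product

# A literal is (variable_name, is_positive); a clause is a list of literals.
# Assumed formula: F = (A or not B) and (not A or not B).
F = [[("A", True), ("B", False)],
     [("A", False), ("B", False)]]

def f_sat(formula, assignment):
    """Cost function: number of clauses satisfied by the assignment."""
    return sum(
        any(assignment[var] == positive for var, positive in clause)
        for clause in formula
    )

def search_space(formula, variables):
    """Enumerate every configuration s in S together with its cost."""
    for values in product([True, False], repeat=len(variables)):
        s = dict(zip(variables, values))
        yield values, f_sat(formula, s)

table = list(search_space(F, ["A", "B"]))
for values, cost in table:
    print(values, cost)
```

The search space has 2^|V| configurations, so this enumeration is only feasible for very small instances.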


Problem: the start point

PROBLEM: a general question to be answered, such as: "Is a given graph G=(V,E) K-colorable?"

INSTANCE: an instance of a problem is a list of arguments, one argument for each parameter of the problem.

A problem is a binary relation on a set I of problem instances and a set S of problem solutions: Q: I → S

Decision problems: Q: I → {0, 1}

In general, if we can solve an optimization problem quickly, we can also solve its related decision problem quickly.


Problem: the start point

A problem is UNDECIDABLE if there is no algorithm that takes as input an instance of the problem and determines whether the answer to that instance is "yes" or "no".

Problems that are solvable by polynomial-time algorithms are regarded as TRACTABLE; ones that require superpolynomial time are regarded as INTRACTABLE.


Problem: the start point

NP-complete problems are widely believed to be intractable, although this has not been proved.

If any single NP-complete problem can be solved in polynomial time, then every NP-complete problem has a polynomial-time algorithm.

If you can establish a problem as NP-complete, you provide good evidence for its intractability.

You would then do better spending your time developing an approximation algorithm rather than searching for a fast algorithm that solves the problem exactly.

Polynomial-time solvable problems are generally regarded as tractable.


Complexity Class P

An algorithm solves a problem in time O(T(n)) if, when it is provided a problem instance i of length n = |i|, the algorithm can produce the solution in at most O(T(n)) time.

A problem is polynomial-time solvable if there exists an algorithm to solve it in time O(n^k) for some constant k.

COMPLEXITY CLASS P: the set of decision problems that are solvable in polynomial time.

A function f is polynomial-time computable if there exists a polynomial-time algorithm A that, given any input x ∈ I, produces as output f(x).


Example: problem in P

Given a graph G=(V,E) and a weight function w(u,v) on the edges that specifies their cost, the problem of determining a minimum-weight subset T ⊆ E that connects all the vertices is called the minimum spanning tree problem (Minimum Spanning Tree).


Example: problem in P

Minimum Weight Spanning Tree.

Input: Graph G, integer k.

Decision problem: Does G have a spanning tree of weight at most k?

If you are provided with a tree of weight at most k as a certificate, the answer can be verified in O(n^2) time.
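The decision version can be answered by computing the weight of a minimum spanning tree and comparing it with k. A minimal sketch using Kruskal's algorithm with a union-find structure; this is a standard technique chosen here for illustration, since the slides do not name an algorithm. The edge format (weight, u, v), the function names and the example graph are assumptions.

```python
def mst_weight(num_vertices, edges):
    """Kruskal's algorithm. edges: list of (weight, u, v), vertices 0..n-1.
    Returns the weight of a minimum spanning tree, or None if G is disconnected."""
    parent = list(range(num_vertices))

    def find(x):
        # Representative of x's component, with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, used = 0, 0
    for weight, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two different components
            parent[ru] = rv
            total += weight
            used += 1
    return total if used == num_vertices - 1 else None

def has_spanning_tree_of_weight_at_most(num_vertices, edges, k):
    """Decision problem: does G have a spanning tree of weight at most k?"""
    w = mst_weight(num_vertices, edges)
    return w is not None and w <= k

# Hypothetical 4-vertex example graph.
example_edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 2, 3), (5, 1, 3)]
print(has_spanning_tree_of_weight_at_most(4, example_edges, 6))
```

Sorting dominates the cost, so the whole procedure runs in polynomial time, which is what places the problem in P.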

Complexity Class NP

Some problems are intractable: as they grow large, we are unable to solve them in reasonable time.

What constitutes reasonable time? Standard working definition: polynomial time.

On an input of size n the worst-case running time is O(n^k) for some constant k.

Polynomial time: O(n^2), O(n^3), O(1), O(n lg n)

Not in polynomial time: O(2^n), O(n^n), O(n!)
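To get a feeling for the gap between the two groups, the following sketch tabulates a few of the growth rates listed above for small input sizes. The function name step_counts and the chosen values of n are illustrative assumptions.

```python
from math import factorial, log2

def step_counts(n):
    """Value of several growth functions at input size n."""
    return {
        "n lg n": n * log2(n),
        "n^2": n ** 2,
        "n^3": n ** 3,
        "2^n": 2 ** n,
        "n!": factorial(n),
    }

for n in (10, 20, 40):
    print(n, step_counts(n))
```

Already at n = 20 the exponential and factorial counts exceed the cubic count by orders of magnitude.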


Complexity Class NP

A decision problem (yes/no question) is in the class NP if it has a nondeterministic polynomial-time algorithm.

Informally, such an algorithm:

Guesses a solution (nondeterministically).

Checks deterministically in polynomial time that the answer is correct.

Or equivalently, when the answer is "yes", there is a certificate (a solution meeting the criteria) that can be verified in polynomial time (deterministically).

The Complexity Class NP is the class of problems whose solutions can be verified by a polynomial-time algorithm.

If Q ∈ P then Q ∈ NP, hence P ⊆ NP.

Open question: is P = NP?


Nondeterminism

Think of a nondeterministic computer as a computer that magically “guesses” a solution, then has to verify that it is correct.

If a solution exists, the computer always guesses it.

One way to imagine it: a parallel computer that can freely spawn an infinite number of processes.

Have one processor work on each possible solution.

All processors attempt to verify that their solution works.

If a processor finds it has a working solution, the answer is "yes".

So: NP = problems verifiable in polynomial time


Review: P and NP

Summary so far:

P = problems that can be solved in polynomial time

NP = problems for which a solution can be verified in polynomial time

Unknown whether P = NP (most suspect not)

The Hamiltonian-cycle problem is in NP:

No polynomial-time algorithm is known for solving it

Easy to verify a solution in polynomial time


Hamiltonian Cycle Problem

A hamiltonian cycle of an undirected graph G=(V,E) is a simple cycle that contains each vertex in V.

A graph that contains a hamiltonian cycle is said to be hamiltonian. NOT ALL GRAPHS ARE HAMILTONIAN.

Hamiltonian-cycle problem: “Does a graph G have a hamiltonian cycle?”

What is the running time of a brute-force algorithm that tries every ordering of the vertices? There are m! possible permutations of the m vertices.


Hamiltonian Cycle Problem

Hamiltonian Cycle (input: a graph G): Does G have a Hamiltonian cycle?

Solution (for the 20-vertex example graph, vertices labelled 0 to 19):

0, 1, 2, 11, 10, 9, 8, 7, 6, 5, 14, 15, 16, 17, 18, 19, 12, 13, 3, 4


Hamiltonian Cycle Problem

Algorithm: list all permutations of the vertices and check each permutation to see if it is a hamiltonian cycle. There are m! possible permutations of the vertices.

CHECK whether a provided cycle is hamiltonian: verify that it is a permutation of the vertices of V and that each of the consecutive edges along the cycle exists in the graph.

This verification can be implemented to run in O(n^2) time, where n is the length of the encoding of G. Hence the existence of a hamiltonian cycle in a graph can be verified in polynomial time.
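The contrast between verifying and searching can be made concrete. A minimal sketch, assuming an undirected graph given as a list of vertices and a list of edges; the function names and the two example graphs are hypothetical. The verifier is the polynomial-time CHECK, while the search tries all m! orderings.

```python
from itertools import permutations

def is_hamiltonian_cycle(vertices, edges, cycle):
    """Verify a certificate: cycle is a sequence of vertices."""
    # The certificate must be a permutation of V: each vertex exactly once.
    if len(cycle) != len(vertices) or set(cycle) != set(vertices):
        return False
    if len(cycle) < 3:
        return False  # a simple cycle in an undirected graph needs 3+ vertices
    edge_set = {frozenset(e) for e in edges}
    # Every consecutive pair, including last -> first, must be an edge of G.
    return all(
        frozenset((cycle[i], cycle[(i + 1) % len(cycle)])) in edge_set
        for i in range(len(cycle))
    )

def find_hamiltonian_cycle(vertices, edges):
    """Brute force: try all m! orderings of the vertices."""
    for candidate in permutations(vertices):
        if is_hamiltonian_cycle(vertices, edges, candidate):
            return list(candidate)
    return None

# Hypothetical example graphs on vertices 0..3.
V = [0, 1, 2, 3]
square = [(0, 1), (1, 2), (2, 3), (3, 0)]  # hamiltonian
star = [(0, 1), (0, 2), (0, 3)]            # not hamiltonian
print(find_hamiltonian_cycle(V, square), find_hamiltonian_cycle(V, star))
```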

NP-Complete Problems

The NP-Complete problems are an interesting class of problems whose status is unknown.

No polynomial-time algorithm has been discovered for an NP-Complete problem.

No superpolynomial lower bound has been proved for any NP-Complete problem, either.

We call this the P = NP open question.

It is the biggest open problem in CS.


NP Complete Problems

The problems in NP that are the “hardest” are called the NP-complete problems.

A problem Q in the class NP is an NP-complete problem if the existence of a polynomial-time algorithm for Q implies the existence of a polynomial-time algorithm for all problems in NP.

Steve Cook proved in 1971 that SAT is NP-complete.

Other problems: use reductions.

NP-Complete Problems

NP-Complete problems are the “hardest” problems in NP:

If any one NP-Complete problem can be solved in polynomial time...

...then every NP-Complete problem can be solved in polynomial time...

...and in fact every problem in NP can be solved in polynomial time (which would show P = NP).

Thus: solve hamiltonian-cycle in O(n^100) time, and you’ve proved that P = NP. Retire rich & famous.

Reduction

The crux of NP-Completeness is reducibility.

Informally, a problem P can be reduced to another problem Q if any instance of P can be “easily rephrased” as an instance of Q, the solution to which provides a solution to the instance of P.

What do you suppose “easily” means?

This rephrasing is called a transformation.

Intuitively: If P reduces to Q, P is “no harder to solve” than Q

Reducibility

An example:

P: Given a set of Booleans, is at least one TRUE?

Q: Given a set of integers, is their sum positive?

Transformation: (x1, x2, …, xn) → (y1, y2, …, yn), where yi = 1 if xi = TRUE, yi = 0 if xi = FALSE

Another example:

Solving linear equations is reducible to solving quadratic equations.

Given an instance ax + b = 0, we transform it to 0x^2 + ax + b = 0.
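The first example can be written out directly: a solver for Q plus the transformation yields a solver for P. A minimal sketch; the function names solve_q, transform and solve_p are illustrative assumptions.

```python
def solve_q(integers):
    """Solver for Q: is the sum of the integers positive?"""
    return sum(integers) > 0

def transform(booleans):
    """Map an instance of P to an instance of Q:
    y_i = 1 if x_i is TRUE, y_i = 0 if x_i is FALSE."""
    return [1 if x else 0 for x in booleans]

def solve_p(booleans):
    """Solver for P (is at least one Boolean TRUE?) obtained via the reduction."""
    return solve_q(transform(booleans))

print(solve_p([False, True, False]), solve_p([False, False]))
```

The transformation takes linear time, and the answer to the transformed instance is "yes" exactly when the answer to the original instance is "yes".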

Using Reductions

If P is polynomial-time reducible to Q, we denote this P ≤P Q.

Definition of NP-Complete:

If P is NP-Complete, P ∈ NP and all problems R in NP are reducible to P.

Formally: R ≤P P for all R ∈ NP

If P ≤P Q and P is NP-Complete, then Q is NP-Hard; if in addition Q ∈ NP, Q is also NP-Complete.

Polynomial-time reductions provide a formal means for showing that one problem is at least as hard as another, to within a polynomial factor.

NP-Completeness

If P ≤P Q, then P is not more than a polynomial factor harder than Q.

A problem P is NP-Complete if P ∈ NP and P' ≤P P for every P' ∈ NP.

If all problems Q ∈ NP are reducible to P, then we say that P is NP-Hard.

A problem P is NP-Complete if it is in the class NP and is NP-Hard (P is NP-Hard and P ∈ NP).

Why Prove NP-Completeness?

Though nobody has proven that P ≠ NP, if you prove a problem NP-Complete, most people accept that it is probably intractable.

Therefore it can be important to prove that a problem is NP-Complete:

Don’t need to come up with an efficient algorithm

Can instead work on approximation algorithms

Proving NP-Completeness

What steps do we have to take to prove a problem P is NP-Complete?

Pick a known NP-Complete problem Q

Reduce Q to P

Describe a transformation that maps instances of Q to instances of P, s.t. “yes” for P = “yes” for Q

Prove the transformation works

Prove it runs in polynomial time

Oh yeah, prove P ∈ NP (What if you can’t?)


Theorem

Definition

A problem B is said to be NP-complete if:

• B ∈ NP

• for every A ∈ NP it holds that A ≤P B

Theorem 1

If B is an NP-complete problem such that B ≤P C for some problem C, then C is NP-hard. If, in addition, C ∈ NP, then C is also NP-complete.


SAT Problem

We define:

• A variable is an element of {x1, x2, . . . , xn}

• A literal is a variable xi or its negation ¬xi

• A clause is a disjunction of k literals: C = xi1 ∨ xi2 ∨ ... ∨ xik

• A Boolean formula is a conjunction of clauses: C1 ∧ C2 ∧ ... ∧ Cm

CNF = conjunctive normal form

PROBLEM: Satisfiability (SAT)

INSTANCE: A Boolean formula

QUESTION: Is there an assignment of truth values to the variables of the formula that makes it true?


EXAMPLE:

• The formula is satisfiable, simply by setting x1=1, x2=0, and x3=0.

• Indeed

• The following formula, instead, is not satisfiable

(try all 8 possible truth assignments)
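Trying all 8 assignments is easy to mechanise. A minimal sketch, assuming clauses are encoded as lists of nonzero integers (+i for xi, -i for its negation). The two formulas below are hypothetical stand-ins chosen for illustration, not the formulas of the example: the first is satisfied by x1=1, x2=0, x3=0, the second is unsatisfiable.

```python
from itertools import product

def evaluate(cnf, assignment):
    """cnf: list of clauses; a clause is a list of nonzero ints.
    assignment: dict mapping variable index i to 0 or 1."""
    return all(
        any(assignment[abs(lit)] == (1 if lit > 0 else 0) for lit in clause)
        for clause in cnf
    )

def satisfying_assignments(cnf, num_vars):
    """Try all 2^n truth assignments and return the satisfying ones."""
    result = []
    for values in product([0, 1], repeat=num_vars):
        assignment = dict(zip(range(1, num_vars + 1), values))
        if evaluate(cnf, assignment):
            result.append(assignment)
    return result

# Hypothetical: (x1 or x2) and (not x2 or x3) and (not x3 or x1)
satisfiable = [[1, 2], [-2, 3], [-3, 1]]
# Hypothetical: (x1 or x2) and (not x1 or x2) and (not x2 or x3) and (not x2 or not x3)
unsatisfiable = [[1, 2], [-1, 2], [-2, 3], [-2, -3]]

print(satisfying_assignments(satisfiable, 3))
print(satisfying_assignments(unsatisfiable, 3))
```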


COOK-LEVIN THEOREM

SAT is NP-Complete

Proof

In brief, the Cook-Levin theorem shows that:

• SAT is in NP, i.e. it can be decided in polynomial time by a suitable model of computation: the Nondeterministic Turing Machine (NTM)

• given the description of an NTM n and an input w for n, one can construct an expression in conjunctive normal form that is satisfiable if and only if the computation of n on input w is accepting

• the length of this formula is polynomial in |n| and in |w|


FIRST EXAMPLE: 3SAT

3SAT is analogous to the satisfiability problem for Boolean formulas (SAT), with the restriction that each clause of the formula has at most 3 literals.

Theorem

3SAT is NP-complete

Proof:

We will prove that 3SAT ∈ NP, and then that SAT ≤P 3SAT. From Theorem 1 it will follow that 3SAT is NP-complete. To show that 3SAT ∈ NP, observe that a certificate (proof) consisting of an assignment of truth values to the variables of a formula can easily be verified in polynomial time. Indeed, the verification consists of substituting the values into the formula and checking the result.
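The verification step of the proof can be sketched as a single pass over the formula. This is an illustrative sketch only: the integer encoding of literals (+i for xi, -i for its negation), the function names and the example formula are assumptions.

```python
def is_3cnf(cnf):
    """Check the 3SAT restriction: every clause has at most 3 literals."""
    return all(1 <= len(clause) <= 3 for clause in cnf)

def verify_certificate(cnf, assignment):
    """Substitute the truth values into the formula and check the result.
    assignment: dict mapping variable index i to True or False."""
    for clause in cnf:
        # A clause holds if at least one of its literals evaluates to true.
        if not any(assignment[abs(lit)] == (lit > 0) for lit in clause):
            return False
    return True

# Hypothetical formula: (x1 or not x2 or x3) and (not x1 or x2) and (not x3)
phi = [[1, -2, 3], [-1, 2], [-3]]
print(verify_certificate(phi, {1: True, 2: True, 3: False}))
```

Each literal is inspected at most once, so the check is linear in the length of the formula.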


SECOND EXAMPLE: CLIQUE

INSTANCE: A graph G = (V, E) and an integer k

QUESTION: Does G have a clique of size k?

(that is, ∃ V’ ⊆ V, |V’| = k, such that ∀ u, v ∈ V’ with u ≠ v it holds that (u, v) ∈ E)

Theorem

CLIQUE is NP-complete

Proof

We will prove that CLIQUE ∈ NP, and then that 3SAT ≤P CLIQUE. From Theorem 1 it will follow that CLIQUE is NP-complete.

A verification algorithm for the CLIQUE problem is the following: given as input the pair (G, k) and as certificate a subset X ⊆ V with |X| = k (a potential solution to the CLIQUE problem), the algorithm checks all pairs of vertices (u, v), with u, v ∈ X, u ≠ v, and outputs YES if and only if each of these pairs is connected by an edge of E. The running time of the algorithm is O(n^2), where n = |V|.
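The verification algorithm just described can be sketched as follows. The graph representation (a vertex list and an edge list), the function name and the example graph are assumptions made for illustration.

```python
from itertools import combinations

def verify_clique(vertices, edges, k, X):
    """Certificate check for CLIQUE: X is the candidate subset of V."""
    X = set(X)
    # The certificate must be a subset of V with exactly k vertices.
    if len(X) != k or not X <= set(vertices):
        return False
    edge_set = {frozenset(e) for e in edges}
    # Every pair of distinct vertices of X must be joined by an edge of E.
    return all(frozenset((u, v)) in edge_set for u, v in combinations(X, 2))

# Hypothetical graph: a triangle 0-1-2 with a pendant vertex 3.
V = [0, 1, 2, 3]
E = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(verify_clique(V, E, 3, {0, 1, 2}), verify_clique(V, E, 3, {1, 2, 3}))
```

The loop examines at most k(k-1)/2 pairs, which matches the O(n^2) bound stated above.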


3SAT ≤P CLIQUE

• Let C1 ∧ C2 ∧ ... ∧ Ck be an instance of 3SAT. We transform it into an instance (G, k) of CLIQUE as follows:

1. For each occurrence of a literal in a clause we create a node

2. We connect all pairs of nodes, except:

• nodes in the same clause

• nodes that represent opposite literals
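The construction can be sketched in a few lines. This is a minimal sketch under stated assumptions: literals are encoded as nonzero integers (+i for xi, -i for its negation), no literal is repeated within a clause, and the function names and example formulas are hypothetical. The brute-force clique search is exponential and is included only to check small examples.

```python
from itertools import combinations

def sat_to_clique(cnf):
    """Transform a 3SAT instance into a CLIQUE instance (nodes, edges, k)."""
    # One node per literal occurrence: (clause index, literal).
    nodes = [(i, lit) for i, clause in enumerate(cnf) for lit in clause]
    edges = [
        (a, b)
        for a, b in combinations(nodes, 2)
        if a[0] != b[0]     # skip nodes in the same clause
        and a[1] != -b[1]   # skip nodes representing opposite literals
    ]
    return nodes, edges, len(cnf)

def has_clique(nodes, edges, k):
    """Brute-force search for a clique of size k (small inputs only)."""
    edge_set = {frozenset(e) for e in edges}
    return any(
        all(frozenset(pair) in edge_set for pair in combinations(subset, 2))
        for subset in combinations(nodes, k)
    )

# Hypothetical: (x1 or x2 or x3) and (not x1 or not x2 or x3) and (x1 or not x3)
sat_formula = [[1, 2, 3], [-1, -2, 3], [1, -3]]
# Hypothetical unsatisfiable formula: (x1) and (not x1)
unsat_formula = [[1], [-1]]

print(has_clique(*sat_to_clique(sat_formula)))
print(has_clique(*sat_to_clique(unsat_formula)))
```

A clique of size k picks one literal per clause with no two of them contradictory, which is exactly a way to satisfy every clause at once.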


EXAMPLE


THIRD EXAMPLE: VERTEX-COVER

INSTANCE: A graph G = (V, E) and an integer k

QUESTION: Is there a V′ ⊆ V, |V′| = k, such that for every (u, v) ∈ E either u ∈ V′ or v ∈ V′?

Theorem

Vertex-Cover is NP-complete

Proof

We will prove that Vertex-Cover ∈ NP, and then that 3SAT ≤P Vertex-Cover. From Theorem 1 it will follow that Vertex-Cover is NP-complete.

That the VERTEX-COVER problem is in NP is clear. A verification algorithm takes as input the instance (G, k) and a certificate V′ ⊆ V, |V′| = k, and checks whether for every (u, v) ∈ E either u ∈ V′ or v ∈ V′. The check can be done in O(n^2) time.
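The verification algorithm for VERTEX-COVER can be sketched as follows. The graph representation, the function name and the example graph are assumptions made for illustration.

```python
def verify_vertex_cover(vertices, edges, k, cover):
    """Certificate check for VERTEX-COVER: cover is the candidate subset of V."""
    cover = set(cover)
    # The certificate must be a subset of V with exactly k vertices.
    if len(cover) != k or not cover <= set(vertices):
        return False
    # Every edge must have at least one endpoint in the cover.
    return all(u in cover or v in cover for u, v in edges)

# Hypothetical graph: the path 0 - 1 - 2 - 3.
V = [0, 1, 2, 3]
E = [(0, 1), (1, 2), (2, 3)]
print(verify_vertex_cover(V, E, 2, {1, 2}), verify_vertex_cover(V, E, 2, {0, 3}))
```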


3SAT ≤P VERTEX-COVER

For each variable xi we create a pair of nodes, one for xi and one for its negation, joined by an edge.

For each clause we create 3 nodes, mutually joined by edges.

For each literal in a clause, we add an edge between its clause node and the node of the variable pair that carries the identical literal.

If the formula consists of m variables and t clauses, G will consist of 2m+3t nodes. Let k = m+2t.

(Figure: a variable pair with nodes xi and ¬xi, and a clause triangle with nodes xi-1, ¬xi, xi+1.)
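The construction can be sketched as follows. This is a minimal sketch under stated assumptions: every clause has exactly 3 literals (repetitions allowed), literals are encoded as nonzero integers (+i for xi, -i for its negation), and all function names and example formulas are hypothetical. The brute-force cover search is exponential and serves only to check small examples.

```python
from itertools import combinations

def sat_to_vertex_cover(cnf, num_vars):
    """Transform a 3SAT instance into a VERTEX-COVER instance (nodes, edges, k)."""
    nodes, edges = [], []
    # Variable gadgets: nodes for x_i and its negation, joined by an edge.
    for i in range(1, num_vars + 1):
        nodes += [("var", i), ("var", -i)]
        edges.append((("var", i), ("var", -i)))
    # Clause gadgets: a triangle per clause, plus one edge from each corner
    # to the variable-gadget node carrying the identical literal.
    for j, clause in enumerate(cnf):
        corners = [("clause", j, pos) for pos in range(3)]
        nodes += corners
        edges += [(corners[0], corners[1]),
                  (corners[1], corners[2]),
                  (corners[0], corners[2])]
        for corner, lit in zip(corners, clause):
            edges.append((corner, ("var", lit)))
    return nodes, edges, num_vars + 2 * len(cnf)  # k = m + 2t

def has_vertex_cover(nodes, edges, k):
    """Brute-force search for a vertex cover of size k (small inputs only)."""
    return any(
        all(u in chosen or v in chosen for u, v in edges)
        for chosen in map(set, combinations(nodes, k))
    )

# Hypothetical: (x1 or x2 or x3) and (not x1 or not x2 or x3)
sat_formula = [[1, 2, 3], [-1, -2, 3]]
# Hypothetical unsatisfiable formula: (x1 or x1 or x1) and (not x1 or not x1 or not x1)
unsat_formula = [[1, 1, 1], [-1, -1, -1]]

print(has_vertex_cover(*sat_to_vertex_cover(sat_formula, 3)))
print(has_vertex_cover(*sat_to_vertex_cover(unsat_formula, 1)))
```

Any cover needs at least one node per variable pair and two per triangle, so a cover of size m+2t leaves no slack: the chosen variable nodes encode a truth assignment, and each triangle can leave one corner out only if that corner's literal is true.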
