Note that this program is deprecated and replaced by the Generic Allocator.
Sources↑
This program was created during an internship at the chair in informatics 6 of the Julius-Maximilians-Universität Würzburg by Martins Avots and Kirill Djebko with assistance of Prof. Dr. Frank Puppe.
Assignment Problem↑
Given a set of supplies and a set of demands an allocation assigns each object of
a subset of demands an element of supplies. Each supply can only be assigned to a
single demand. A set of preferences contain for some demands one or more preferred
supplies. A set of constraints describe some properties which should be met by an
allocation. The rating of an allocation quantifies its compliance with the given preferences
and constraints. The higher the rating the more constraints and preferences are not
met by an allocation. A rating of zero means that every preference and constraint
is met by an allocation. An optimal allocation is an allocation in which each demand
is assigned to a supply and where the rating is minimal.
A classical example for such an assignment problem would be the Sudoku puzzle in which
a set of numbers needs to be placed onto a grid following certain rules. Another problem
is the graph coloring problem in which each node of a graph is colored and neighboring
nodes have to have different colors.
This program solves given problems with the help of multiple algorithms. Nevertheless,
there are problems that can be modeled as an assignment problem but cannot be solved
by the Universal Allocation Program.
User interface↑
The problem which is solved by the program is stored inside an Excel file. The program
can be started via the following shell command:
For further information of the command line options following command can be used:
On some platforms the
“java -jar“
part can be omitted. An alternative way to start the program is to double click at
the nap.jar file which will start a very basic web server. In this case the standard
web browser will open a site to which the Excel files can be uploaded. During the
calculations the browser will show the loading page animation specified by the browser.
When the calculations are done a download is started. Note that this method does not
require an internet connection as everything is done on the local computer.
Stages↑
A pipeline of assignment problems can be used in order to describe more complex problems.
Such a pipeline consists of multiple stages which have an absolute order. The set
of supplies and the set of demands need to be explicitly stated for first assignment
problem of such a pipeline. The subsequent problems only require an explicit enumeration
of the supply elements. The respective set of demands is defined by the solution of
the previous assignment problem of the pipeline. In the following, the stage number
is the position of a problem inside the pipeline. A single assignment problem without
connections to other problems is formally a pipeline with one element in this context.
Supply definition↑
The set of supplies is defined by a table. The supply header line defines the attributes
of which each supply consists of. Each entry has a name and ends with
“(int)“
if the attribute is an integer. Otherwise the entry ends with
“(ref)“
. The following rows of the table contain the supply elements and use the same columns
as the header. Each row describes a unique element of the supply set but their content
do not have to be unique. This table is stored into a sheet called
“Angebot_0“
. The following example shows a supply definition which consists of a Group of 4.
Note that the first and second supply have the same content but still are two different
supplies. This example shows that a project with the id
“K1“
needs 4 peoples with different primary and secondary duties.
ProjectId(ref)
Primary Duty(ref)
Secondary Duty(ref)
K1
Programming
None
K1
Programming
None
K1
Testing
Planning
K1
Planning
Documentation
Demand definition↑
The demand definition has the same format as the supply definition. The only difference
is that the sheet's name is
“Nachrage_<stage>“
. Where <stage> is the number of the problem's stage. Note that most problems probably
will only contain one stage. Which means that most users won't bother with stages
at all as they can be ignored in such cases. The following table shows a demand definition
suited for the previous supply definition. In this case the sheet's name is
“Nachfrage_0“
because it is the first and only stage of this example. As you can see the people
which need to be assigned to a Project do not have perfectly fit skills.
Name(ref)
Many experience in(ref)
Some experience in(ref)
Alpha
Programming
None
Bravo
Testing
Programming
Charlie
Programming
None
Delta
Programming
Documentation
Preferences definition↑
The preference definition is stored into a sheet named
“Präferenzen_<stage>“
where <stage> is the stage number of the corresponding assignment problem of the pipeline.
The header of the preferences sheet is defined by the concatenation of the headers'
of the supplies and the demands of respective stage and a weight attribute called
“Gewicht(lb)“
. This means that the header of the preference is the header of the solution for a
given stage plus the
“Gewicht(lb)“
attribute.
The preferences table contains assignments which the user wishes to be part of the
solution. The
“Gewicht(lb)“
attribute of a preference describes how important it is for the user that this assignment
is located in the solution. It consists of a positive integer. The lower the number
the more important the preference is. The preference of a demand with the highest
number defines how important it is that this demand will get any of its preferences.
The higher this number is the more it is likely that any preference of this demand
will be met. In most cases the
“Gewicht(lb)“
attribute should be a very low number which guarantees that constraints are more important
than preferences. The following table shows a preferences definition based on the
previous tables. This examples also show that nearly all persons want to get a role
as a programmer which is not possible. Preferences are often less important than constraints
because in the other case the preference would already be part of a correct solution
and therefore the corresponding supply and demand could be removed from the input
in order to get better results or to calculate the solution more quickly.
ProjectId(ref)
PrimaryDuty(ref)
SecondaryDuty(ref)
Name(ref)
Many experience in(ref)
Some experience in (ref)
Gewicht(lb)
K1
Programming
None
Alpha
Programming
None
1
K1
Testing
Planning
Bravo
Testing
Programming
1
K1
Programming
None
Charlie
Programming
None
1
K1
Programming
None
Delta
Programming
Documentation
1
Constraints definition↑
The constraint definition of an assignment problem is stored into a sheet named
“Constraints_<stage>“
where <stage> is the stage number of the concerned problem. It has the same header
as the preference table of the same stage. Each row describes one constraint. A constraint
defines the following abstract rule: the biggest assignment subset which complies
with IF has also to comply with THEN. To put it in other words: every cluster which
is created by the condition IF also must comply with the condition THEN.
Lets assume following solution in order to explain the functioning of constraints
by example:
ProjectId(cell)
Primary Duty(ref)
Secondary Duty(ref)
Name(ref)
Many experience in(ref)
Some experience in(ref)
K1
Programming
None
Charlie
Programming
None
K1
Programming
None
Alpha
Programming
None
K1
Testing
Planning
Bravo
Testing
Programming
K1
Planning
Documentation
Delta
Programming
Documentation
Every entry of a constraint's row which has the format
“<value>“
is part of IF. Such an entry means that every element in each cluster, which is formed
by this constraint, has to have the Value <value> at the attribute of the same column.
The following constraint creates a cluster consisting of Charlie and Alpha because
they are the only people who got the role as a programmer in the assumed solution.
Note that this constraint does not have any effect as there is no entry which is part
of THEN. Constraint which selects every assignment in the solution where the primary
duty is
“Programming“
:
ProjectId(ref)
Primary Duty(ref)
Secondary Duty(ref)
Name(ref)
Many experience in(ref)
Some experience in(ref)
Gewicht(lb)
Programming
1000
The cluster which is formed by the IF part of the previous constraint:
K1
Programming
None
Charlie
Programming
None
K1
Programming
None
Alpha
Programming
None
Every entry of a constraint which consists of
“*“
is part of IF. Such an entry means that the cluster which is created by this constraint
is parted into multiple clusters. For each cluster all elements have the same value
at the column of such an entry. The following tables show the clusters created by
a constraint which consists of a
“*“
at the
“Primary Duty(ref)“
attribute. Constraint which clusters the solution according to the
“Primary Duty“
attribute:
ProjectId(ref)
Primary Duty(ref)
Secondary Duty(ref)
Name(ref)
Many experience in(ref)
Some experience in(ref)
Gewicht(lb)
*
1000
First cluster which is formed by the IF part of the previous constraint:
K1
Programming
None
Charlie
Programming
None
K1
Programming
None
Alpha
Programming
None
Cluster 2 which is formed by the IF part of the previous constraint:
K1
Testing
Planning
Bravo
Testing
Programming
Last cluster which is formed by the IF part of the previous constraint:
K1
Planning
Documentation
Delta
Programming
Documentation
Every other entry which starts with
“#“
followed by a function call is part of THEN. Possible function calls are listed at
the chapter constraint functions. These function calls describe which conditions has
to be met by the clusters which are created by the IF part of the same constraint.
The following table shows the constraints which were used in order to generate the
solution mentioned at the beginning of this chapter:
ProjectId(ref)
Primary Duty(ref)
Secondary Duty(ref)
Name(ref)
Many experience in(ref)
Some experience in(ref)
Gewicht(lb)
Programming
#OR(Programming)
1 000 000 000
Testing
#OR(Testing)
1 000 000 000
Planning
#OR(Planning)
1 000 000
Planning
#OR(Planning)
1 000 000
Testing
#OR(Testing)
1 000 000
The first, second and third constraints say that the guy whose primary duty is X should
also have many experience in X (i.e. that the people whose primary duty is programming
should also have many experience in programming). Each of these constraints has the
same value for the
“Gewicht(lb)“
attribute which means that each of them is equally important. The last two constraints
demand that the people whose secondary duty is planning or testing also should have
some experience in it. These two constraints have a lower value at the
“Gewicht(lb)“
attribute which means that these two constraints are less important than the first
three constraints. The
“Gewicht(lb)“
is often set to very high values in order to ensure that the constraints are more
important than the preferences and is limited to the values between 0 and 922337203685477.
Constraint Functions↑
The THEN part of each constraint consists of constraint functions which describe the
conditions that have to be meet by the clusters formed by the IF part of the same
constraint. All available functions are listed below.
ISCOUNT(Value, Integer)↑
Demands that each cluster defined by the IF part of a constraint has exactly <Integer>
elements which have the value <Value> at the same column as the appearance of this
function.
MINCOUNT(Value, Integer)↑
Demands that each cluster defined by the IF part of a constraint has at least <Integer>
elements which have the value <Value> at the same column as the appearance of this
function.
MAXCOUNT(Value, Integer)↑
Demands that each cluster defined by the IF part of a constraint has at at most <Integer>
many elements which have the value <Value> at the same column as the appearance of
this function.
OR(Values...)↑
Demands that each cluster defined by the IF part of a constraint has only elements
containing a value of the enumartion <Values...> at the same column as the appearance
of this function. Each element of <Values...> is separated by a comma.
NOT(Value...)↑
Demands that each cluster defined by the IF part of a constraint has only elements
which has no value at the same column as the appearance of this function that is an
element of <Values...>. Each element of <Values...> is separated by a comma.
MINDIFF(Integer)↑
Demands that each cluster defined by the IF part of a constraint has only elements
which have a pairwise minimum distance of <Integer> or greater. The distance between
two elements is defined by the difference of the two integer values of these two elements
which are located at the same column as the appearance of this constraint function.
Therefore this function can only meaningfully be used at columns which are of the
attribute
“(int)“
.
MAXDIFF(Integer)↑
Demands that each cluster defined by the IF part of a constraint has only elements
which have a pairwise maximum distance of <Integer> or less. The distance between
two elements is defined by the difference of the two integer values of these two elements
which are located at the same column as the appearance of this constraint function.
Therefore this function can only meaningfully be used at columns which are of the
attribute
“(int)“
.
CONSECUTIVE()↑
Demands that each cluster defined by the IF part of a constraint has only elements
which are consecutive. This means that for each element in one cluster there is exactly
one element at the same cluster which is adjacent to this element. Two elements are
adjacent if their pairwise distance is exactly 1. The distance between two elements
is defined by the difference of the two integer values of these two elements which
are located at the same column as the appearance of this constraint function.
In other words this constraint function demands that the elements of a cluster form
an ordered queue without any gap between two elements and therefore this function
is only sensibly usable at columns which are of the attribute
“(int)“
.
Configuration definition↑
The configuration definition is stored in a table named "config_<stage>" where <stage>
is the stage number of the affected assignment problem. This table consists of two
the columns with the header
“Key“
and
“Value“
. In the next subsection the possible options are listed.
Key:
“number of restarts:“
↑
Value: an integer describing how many restarts are used during problem solving.
Default:
“false“
Key:
“is flusher enabled:“
↑
Value: the flusher is enabled if set to "true" and in the other case set to "false". See
chapter Flusher for information about the flusher.
Default:
“false“
Key:
“runs parallel:“
↑
Value: random restarts are computed in parallel if set to "true" and in the other case set
to "false".
Default:
“false“
Key:
“maximum number of plateau steps:“
↑
Value: maximal number of plateau steps which should be walked by the hill climber before
aborting the computation.
Default:
“10“
Key:
“sort solution in order:“
↑
Value: a list of column numbers which are separated via semicolon. The column numbering is
starting with 1 and after each number there have to be a semicolon. The list describes
in which order the columns of the solution are going to be sorted.
Default: Empty String (no sorting).
Key:
“temperature function:“
↑
Value: a string which describes the temperature function which is used for simulated annealing.
The following values are valid:
Default: simulated annealing is disabled
“ONEDIVX:“
f(x) = [log(2) / log(x^1.5)] - 0.05
“LOGTWO:“
f(x) = [log(2) / log(x^1.5)] - 0.05
“LOGTHREE:“
f(x) = [log(3) / log(x^1.5)] - 0.05
“LOGFOUR:“
f(x) = [log(4) / log(x^2)] - 0.05
“LOGFIVE:“
f(x) = [log(5) / log(x^2)] - 0.05
Key:
“repeat optimizers until no improvement found:“
↑
Value: repeats hill climbing algorithm with enabled flusher and without restarted temperature
function as long as there is an improvement found if set to
“true“
.
Default:
“false“
Key:
“number of threads:“
↑
Value: the maximum number of threads which are used during Random Restart.
Default: 3
Functioning↑
The following chapters describe the problem solving components of the program in the
order they are used in the program. Some of them can be disabled.
Initializer↑
The Initializer assigns demands to supplies. At the first stage it assigns demands
to supplies based on preferences only. It iterates multiple times over all not yet
assigned demands. At the first iteration it tries to assign to each demand the most
preferred supply if there is one available. At the second iteration it tries to assign
to each demand the second most preferred supply if there is one available and so on.
The first stage ends if all demands are assigned or if there are no preferred supplies
left for not yet assigned demands. At the second stage supplies are assigned to demands
based on proposals which are created via constraints. If there are multiple proposed
supplies for one demand then the supply is assigned to the demand which complies best
to the constraints. At the third stage the initializer assigns any supplies left to
each free demand randomly.
Hill Climber↑
The hill climber tries to improve the current solution by swapping. There are 3 types
of swaps. The first attribute, which is used most of the time in general, selects
two assigned demands and swaps their respective supplies. The next attribute replaces
the supply of an assigned demand with a supply which is not assigned to any demand.
The last attribute replaces the demand of an assigned supply with a demand that is
not assigned to any supply. The hill climber works in two phases. At the first phase
at each hill climbing step swaps are proposed by the constraints. The hill climber
checks which proposed swap improves the solution the most. At the end of the step
the best proposed swap is executed. If no proposed swap improves the solution a plateau
step is done. This step changes the current assignment of demands to supplies by one
step without downgrading the current assignment. The first phase runs as long it finds
swaps which improve the current assignment, as long not too many consecutively plateau
steps are done or if no swap can be found which does not downgrade the solution. At
the second phase the hill climber uses proposed swaps which are generated by preferences
but aside from that works like in the first phase.
Flusher↑
If the flusher is enabled it removes some assignments of the solution created by the
hill climber based on proposals of the constraints. This is only done if the removal
of each assignment improves the compliance of the current solution with the given
preferences and constraints. After that the initializer and hill climber are used
again. This is repeated as long the flusher removes assignments from the solution
provided by the hill climber and as long each iteration improves the solution.
Simulated Annealing↑
If simulated annealing is activated there is a probability that a random swap is done
during an hill climbing step. A downgrade by this random swap is accepted. This probability
is defined by the temperature function and gets lower as the time progresses during
problem solving.
Random Restart↑
If random restart is activated the complete calculation of the solution is repeated
multiple times. The solution with the best rating is returned as the result. These
calculations can be executed in parallel. If one of these solutions is complying with
all constraints and preferences than every other remaining calculations are aborted.
Output↑
If the program is used via console the output of the solution is stored in an excel
table and the path of this file is printed to the console. If the program is used
via web browser a download of the excel table is started.
Examples↑
There are multiple examples at the folder
“testData“
of the provided program of this program. At the end of each entry the corresponding
command is listed in order to execute the example from the console.
Problem
Excel file name
Command for console
graph coloring
graph_coloring.xlsx
java -jar nap.jar -s ./testData/graph_coloring.xlsx
8-Queen problem
queen_8.xlsx
java -jar nap.jar -s ./testData/queen_8.xlsx
sudoku
Sudoku_hard.xlsx
java -jar nap.jar -s ./testData/Sudoku_hard.xlsx
class scheduling problem
class_scheduling.xlsx
java -jar nap.jar -s ./testData/class_scheduling.xlsx