The explosively increasing network data in various application domains has raised privacy concerns for the individuals involved. Recent studies show that simply removing the identities of nodes before publishing the graph/social network data does not guarantee privacy. The structure of the graph itself, along with its basic form the degree of nodes, can reveal the identities of individuals.
To address this issue, we study a specific graph-anonymization problem. We call a graph k-anonymous if for every node v, there exist at least k-1 other nodes in the graph with the same degree as v. And we are interested in achieving k-anonymous on a graph with the minimum number of graph-modification operations.
We simplify the problem. Pick n nodes out of the entire graph G and list their degrees in ascending order. We define a sequence k-anonymous if for every element s, there exist at least k-1 other elements in the sequence equal to s. To let the given sequence k-anonymous, you could do one operation only—decrease some of the numbers in the sequence. And we define the cost of the modification the sum of the difference of all numbers you modified. e.g. sequence 2, 2, 3, 4, 4, 5, 5, with k=3, can be modified to 2, 2, 2, 4, 4, 4, 4, which satisfy 3-anonymous property and the cost of the modification will be |3-2| + |5-4| + |5-4| = 3. Give a sequence with n numbers in ascending order and k, we want to know the modification with minimal cost among all modifications which adjust the sequence k-anonymous.
The first line of the input file contains a single integer T (1 ≤ T ≤ 20) – the number of tests in the input file. Each test starts with a line containing two numbers n (2 ≤ n ≤ 500000) – the amount of numbers in the sequence and k (2 ≤ k ≤ n). It is followed by a line with n integer numbers—the degree sequence in ascending order. And every number s in the sequence is in the range [0, 500000].
For each test, output one line containing a single integer—the minimal cost.