Challenge - Defraud the Investors¶
You've developed a model that predicts the probability a 🏠 house for sale can be flipped for a profit 💸. Your model isn't very good, as indicated by its predictions on historic data.
Your investors want to see these results, but you're afraid to share them. You devise the following algorithm to make your predictions look better without looking artificial.
Step 1:
Choose 5 random indexes (without replacement)
Step 2:
Perfectly reorder the prediction scores at these indexes
to optimize the accuracy of these 5 predictions
For example
If you had these prediction scores and truths
indexes: [ 0, 1, 2, 3, 4]
scores: [ 0.3, 0.8, 0.2, 0.6, 0.3]
truths: [True, False, True, False, True]
and you randomly selected indexes 1, 2, and 4, you would reorder their scores like this.
indexes: [ 0, 1, 2, 3, 4]
old_scores: [ 0.3, 0.8, 0.2, 0.6, 0.3]
new_scores: [ 0.3, 0.2, 0.3, 0.6, 0.8]
truths: [True, False, True, False, True]
This boosts your accuracy rate from 0% to 20%.
In [22]:
import numpy as np
rng = np.random.default_rng(123)
targets = rng.uniform(low=0, high=1, size=20) >= 0.6
preds = np.round(rng.uniform(low=0, high=1, size=20), 2)
print(targets)
print(preds)
# [ True False False ... False True False]
# [ 0.23 0.17 0.50 ... 0.87 0.30 0.53]
[ True False False False False True True False True True False False True False True True True False True False] [0.23 0.17 0.5 0.58 0.18 0.01 0.47 0.73 0.92 0.63 0.92 0.86 0.22 0.87 0.73 0.28 0.8 0.87 0.3 0.53]
In [58]:
indices = rng.choice(np.arange(0, preds.size), size=5, replace=False)
index_ordered_preds = np.sort(preds[indices])
truth_ordered_preds = np.argsort(targets)
_, og_index, _ = np.intersect1d(truth_ordered_preds, indices, return_indices=True)
preds[og_index] = index_ordered_preds
preds
Out[58]:
array([0.92, 0.47, 0.63, 0.3 , 0.53, 0.92, 0.47, 0.8 , 0.87, 0.87, 0.87, 0.87, 0.92, 0.3 , 0.92, 0.87, 0.8 , 0.87, 0.3 , 0.73])
In [59]:
def accuracy_rate(preds, targets):
return np.mean((preds >= 0.5) == targets)
# Accuracy before finagling
accuracy_rate(preds, targets) # 0.3
Out[59]:
0.55