This is a problem on HackerRank that I spent a good deal of time trying to solve. I decided to write up my process for solving the problem in the interest of reinforcing what I’ve learned.

The problem description states that you’re to determine whether an expenditure is fraudulent activity by comparing a day’s spending to the median of a previous number of days spending.

You’re provided with an array `expenditures`

where `expenditures[i]`

represents the amount of money spent on day `i`

and an integer `d`

representing the number of days to look back in the expenditures array to determine the median spending.

Essentially, the goal is to use a sliding window of size `d`

in the `expenditures`

array to determine a number, `n = median x 2`

and if `expenditures[i]`

is greater than or equal to this number, classify it as fraudulent. Return an integer representing all the days in the array that could be classified as fraudulent.

My first solution was a bit of a brute force approach. I used a sliding window to capture the previous `d`

days of expenditures, sorted the days and determined the median. This was fine for test cases 0, 6 and 7 but timed out for test cases 1, 2, 3, 4 and 5.

**First Solution**

```
def activityNotifications(expenditure, d):
total = 0
for i in range(d, len(expenditure)):
days = expenditure[i-d:i]
arr = sorted(days)
if d%2 != 0:
median = arr[(d+1)//2 - 1]
else:
median = (arr[(d)//2 - 1] + arr[(d+1)//2])/2
if expenditure[i] >= 2*median:
total +=1
return total
```

While technically correct, this solution times out on HackerRanks larger test inputs. I attribute most of the time complexity to the `sorted()`

function which has a worst case performance of `O( n log n)`

. So what to do?

After reading around a bit in the discussions, I noticed someone had mentioned counting-sort. I was not previously familiar with this technique, but after some reading I learned that it’s an effective sorting algorithm in cases where the variation in integer keys is not significantly greater than the number of items being sorted. Perfect for our use case where `1 ≤ d ≤ n`

and `0 ≤ expenditure[i] ≤ 200`

. So how do we use it?

We build a frequency array for the range 0 through 201 and use it to store the frequency of each expenditure for the previous `d`

days. The frequency array can be traversed until the sum of `frequency[i]`

is greater than or equal to `d`

.

In the following solution, I use `counts`

to track these frequencies and I’ve written a separate function `get_median`

to return the median.

**Second Solution**

```
# Complete the activityNotifications function below.
def activityNotifications(expenditure, d):
total = 0
counts = [0] * 201
for day in range(d):
counts[expenditure[day]] += 1
if d%2 == 0: #even
odd = False
else: #odd
odd = True
oldest_day = 0
for i in range(d, len(expenditure)):
median = get_median(counts, odd, d)
if expenditure[i] >= 2*median:
total += 1
counts[expenditure[oldest_day]] -= 1
counts[expenditure[i]] += 1
oldest_day += 1
return total
def get_median(counts, odd, d):
temp = 0
left, right = -1, -1
for i, v in enumerate(counts):
temp += v
if odd:
if temp >= ((d//2)+1):
return i
else:
if temp >= (d//2):
left = i
if temp > (d//2) and left != -1:
right = i
return (left + right) // 2
if temp > (d//2) and left == -1:
return i
```

This solution passes test cases 0, 2, 5, 6 and 7 but fails 1, 3 and 4. I’m unable to determine exactly why the failure at this time, but I plan to revisit this to determine the cause.

If you find any errors in my code, let me know. I’d love to find a solution that passes all test cases!

Here are some other interesting solutions to this problem, not necessarily in Python:

- https://fizzbuzzer.com/fraudulent-activity-notifications-challenge/
- https://codereview.stackexchange.com/questions/230294/timeout-error-in-fraudulent-activity-notification-hackerrank
- https://www.youtube.com/watch?v=46V6tnxy_Vs