Python/other/scoring_algorithm.py

"""
developed by: markmelnic
original repo: https://github.com/markmelnic/Scoring-Algorithm

Analyse data using a range-based percentual proximity algorithm
and calculate the linear maximum likelihood estimation.
The basic principle is that the values in each column are rescaled
to a range from 0 to 1, and each row's column scores are added
up to get that row's total score.

==========
Example for data of vehicles
price|mileage|registration_year
20k  |60k    |2012
22k  |50k    |2011
23k  |90k    |2015
16k  |210k   |2010

We want the vehicle with the lowest price and the lowest mileage,
but the newest registration year.
Thus the weights for each column are as follows:
[0, 0, 1]
"""


def procentual_proximity(
    source_data: list[list[float]], weights: list[int]
) -> list[list[float]]:

    """
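    weights - int list, one entry per column of source_data
    possible values - 0 / 1
    0 if lower values in the column should score higher
    1 if higher values in the column should score higher

    Each row's total score is appended to that row of source_data in place,
    and the same (now modified) list is returned.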

    >>> procentual_proximity([[20, 60, 2012],[23, 90, 2015],[22, 50, 2011]], [0, 0, 1])
    [[20, 60, 2012, 2.0], [23, 90, 2015, 1.0], [22, 50, 2011, 1.3333333333333335]]
    """

    # getting data
    data_lists: list[list[float]] = []
    for data in source_data:
        for i, el in enumerate(data):
            if len(data_lists) < i + 1:
                data_lists.append([])
            data_lists[i].append(float(el))

    score_lists: list[list[float]] = []
    # calculating each score
    for dlist, weight in zip(data_lists, weights):
        mind = min(dlist)
        maxd = max(dlist)

        score: list[float] = []
        # for weight 0 score is 1 - actual score
        if weight == 0:
            for item in dlist:
                try:
                    score.append(1 - ((item - mind) / (maxd - mind)))
                except ZeroDivisionError:
                    score.append(1)

        elif weight == 1:
            for item in dlist:
                try:
                    score.append((item - mind) / (maxd - mind))
                except ZeroDivisionError:
                    score.append(0)

        # weight not 0 or 1
        else:
            raise ValueError(f"Invalid weight of {weight:f} provided")

        score_lists.append(score)

    # initialize final scores (one running total per row)
    final_scores: list[float] = [0 for _ in range(len(score_lists[0]))]

    # generate final scores by summing each column's score into its row total
    for slist in score_lists:
        for j, ele in enumerate(slist):
            final_scores[j] += ele

    # append scores to source data (this modifies the caller's rows)
    for row, total in zip(source_data, final_scores):
        row.append(total)

    return source_data
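
The vehicle table from the module docstring can be scored with a short standalone sketch. It is not part of the original module: `score_rows` and `higher_is_better` are hypothetical names chosen for illustration. Unlike `procentual_proximity`, this variant returns only the totals and leaves its input untouched. It assumes every row has the same number of columns.

```python
def score_rows(rows: list[list[float]], higher_is_better: list[int]) -> list[float]:
    """Return one total score per row without modifying `rows`.

    Hypothetical helper for illustration; it follows the same 0/1 weight
    convention as procentual_proximity.
    """
    columns = list(zip(*rows))  # transpose rows into columns
    if len(columns) != len(higher_is_better):
        raise ValueError("Expected exactly one weight per column")

    totals = [0.0] * len(rows)
    for column, weight in zip(columns, higher_is_better):
        if weight not in (0, 1):
            raise ValueError(f"Invalid weight of {weight} provided")
        low, high = min(column), max(column)
        span = high - low
        for row_index, value in enumerate(column):
            # A constant column has no spread; treating its normalised value
            # as 0 gives the same fallback scores as the original function
            # (1 for weight 0, 0 for weight 1).
            normalised = (value - low) / span if span else 0.0
            totals[row_index] += normalised if weight == 1 else 1 - normalised
    return totals


# price (k), mileage (k km), registration year, as in the module docstring
vehicles = [
    [20, 60, 2012],
    [22, 50, 2011],
    [23, 90, 2015],
    [16, 210, 2010],
]
totals = score_rows(vehicles, [0, 0, 1])
print([round(total, 3) for total in totals])  # [1.766, 1.343, 1.75, 1.0]
best = vehicles[totals.index(max(totals))]
print(best)  # [20, 60, 2012]
```

Under these weights the first vehicle scores highest, narrowly ahead of the newest one. The cheapest vehicle ranks last because it is worst on both mileage and registration year.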