diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index bf3420185..096582e45 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -2,20 +2,20 @@ ## Before contributing -Welcome to [TheAlgorithms/Python](https://github.com/TheAlgorithms/Python)! Before sending your pull requests, make sure that you __read the whole guidelines__. If you have any doubt on the contributing guide, please feel free to [state it clearly in an issue](https://github.com/TheAlgorithms/Python/issues/new) or ask the community in [Gitter](https://gitter.im/TheAlgorithms/community). +Welcome to [TheAlgorithms/Python](https://github.com/TheAlgorithms/Python)! Before submitting your pull requests, please ensure that you __read the whole guidelines__. If you have any doubts about the contributing guide, please feel free to [state it clearly in an issue](https://github.com/TheAlgorithms/Python/issues/new) or ask the community on [Gitter](https://gitter.im/TheAlgorithms/community). ## Contributing ### Contributor -We are very happy that you are considering implementing algorithms and data structures for others! This repository is referenced and used by learners from all over the globe. Being one of our contributors, you agree and confirm that: +We are delighted that you are considering implementing algorithms and data structures for others! This repository is referenced and used by learners from all over the globe. By being one of our contributors, you agree and confirm that: -- You did your work - no plagiarism allowed +- You did your work - no plagiarism allowed. - Any plagiarized work will not be merged. -- Your work will be distributed under [MIT License](LICENSE.md) once your pull request is merged -- Your submitted work fulfils or mostly fulfils our styles and standards +- Your work will be distributed under [MIT License](LICENSE.md) once your pull request is merged. +- Your submitted work fulfills or mostly fulfills our styles and standards. -__New implementation__ is welcome! 
For example, new solutions for a problem, different representations for a graph data structure or algorithm designs with different complexity but __identical implementation__ of an existing implementation is not allowed. Please check whether the solution is already implemented or not before submitting your pull request.
+__New implementation__ is welcome! For example, new solutions for a problem, different representations for a graph data structure, or algorithm designs with different complexity are all acceptable. However, an __identical implementation__ of an existing solution is not allowed. Please check whether the solution is already implemented before submitting your pull request.
 
 __Improving comments__ and __writing proper tests__ are also highly welcome.
 
@@ -23,7 +23,7 @@ __Improving comments__ and __writing proper tests__ are also highly welcome.
 
 We appreciate any contribution, from fixing a grammar mistake in a comment to implementing complex algorithms. Please read this section if you are contributing your work.
 
-Your contribution will be tested by our [automated testing on GitHub Actions](https://github.com/TheAlgorithms/Python/actions) to save time and mental energy. After you have submitted your pull request, you should see the GitHub Actions tests start to run at the bottom of your submission page. If those tests fail, then click on the ___details___ button try to read through the GitHub Actions output to understand the failure. If you do not understand, please leave a comment on your submission page and a community member will try to help.
+Your contribution will be tested by our [automated testing on GitHub Actions](https://github.com/TheAlgorithms/Python/actions) to save time and mental energy. After you have submitted your pull request, you should see the GitHub Actions tests start to run at the bottom of your submission page. If those tests fail, then click on the ___details___ button to read through the GitHub Actions output to understand the failure.
If you do not understand, please leave a comment on your submission page and a community member will try to help. #### Issues @@ -58,7 +58,7 @@ Algorithms should: * contain doctests that test both valid and erroneous input values * return all calculation results instead of printing or plotting them -Algorithms in this repo should not be how-to examples for existing Python packages. Instead, they should perform internal calculations or manipulations to convert input values into different output values. Those calculations or manipulations can use data types, classes, or functions of existing Python packages but each algorithm in this repo should add unique value. +Algorithms in this repo should not be how-to examples for existing Python packages. Instead, they should perform internal calculations or manipulations to convert input values into different output values. Those calculations or manipulations can use data types, classes, or functions of existing Python packages but each algorithm in this repo should add unique value. #### Pre-commit plugin Use [pre-commit](https://pre-commit.com/#installation) to automatically format your code to match our coding style: @@ -77,7 +77,7 @@ pre-commit run --all-files --show-diff-on-failure We want your work to be readable by others; therefore, we encourage you to note the following: -- Please write in Python 3.12+. For instance: `print()` is a function in Python 3 so `print "Hello"` will *not* work but `print("Hello")` will. +- Please write in Python 3.12+. For instance: `print()` is a function in Python 3 so `print "Hello"` will *not* work but `print("Hello")` will. - Please focus hard on the naming of functions, classes, and variables. Help your reader by using __descriptive names__ that can help you to remove redundant comments. - Single letter variable names are *old school* so please avoid them unless their life only spans a few lines. 
- Expand acronyms because `gcd()` is hard to understand but `greatest_common_divisor()` is not. @@ -145,7 +145,7 @@ We want your work to be readable by others; therefore, we encourage you to note python3 -m doctest -v my_submission.py ``` - The use of the Python builtin `input()` function is __not__ encouraged: + The use of the Python built-in `input()` function is __not__ encouraged: ```python input('Enter your input:') diff --git a/DIRECTORY.md b/DIRECTORY.md index 5f8eabb6d..d108acf8d 100644 --- a/DIRECTORY.md +++ b/DIRECTORY.md @@ -774,8 +774,8 @@ ## Other * [Activity Selection](other/activity_selection.py) * [Alternative List Arrange](other/alternative_list_arrange.py) + * [Bankers Algorithm](other/bankers_algorithm.py) * [Davis Putnam Logemann Loveland](other/davis_putnam_logemann_loveland.py) - * [Dijkstra Bankers Algorithm](other/dijkstra_bankers_algorithm.py) * [Doomsday](other/doomsday.py) * [Fischer Yates Shuffle](other/fischer_yates_shuffle.py) * [Gauss Easter](other/gauss_easter.py) diff --git a/computer_vision/haralick_descriptors.py b/computer_vision/haralick_descriptors.py index 413cea304..007421e34 100644 --- a/computer_vision/haralick_descriptors.py +++ b/computer_vision/haralick_descriptors.py @@ -253,13 +253,13 @@ def matrix_concurrency(image: np.ndarray, coordinate: tuple[int, int]) -> np.nda def haralick_descriptors(matrix: np.ndarray) -> list[float]: - """Calculates all 8 Haralick descriptors based on co-occurence input matrix. + """Calculates all 8 Haralick descriptors based on co-occurrence input matrix. All descriptors are as follows: Maximum probability, Inverse Difference, Homogeneity, Entropy, Energy, Dissimilarity, Contrast and Correlation Args: - matrix: Co-occurence matrix to use as base for calculating descriptors. + matrix: Co-occurrence matrix to use as base for calculating descriptors. 
Returns: Reverse ordered list of resulting descriptors diff --git a/data_structures/arrays/equilibrium_index_in_array.py b/data_structures/arrays/equilibrium_index_in_array.py index 4099896d2..8802db620 100644 --- a/data_structures/arrays/equilibrium_index_in_array.py +++ b/data_structures/arrays/equilibrium_index_in_array.py @@ -2,7 +2,7 @@ Find the Equilibrium Index of an Array. Reference: https://www.geeksforgeeks.org/equilibrium-index-of-an-array/ -Python doctests can be run with the following command: +Python doctest can be run with the following command: python -m doctest -v equilibrium_index.py Given a sequence arr[] of size n, this function returns @@ -20,35 +20,34 @@ Output: 3 """ -def equilibrium_index(arr: list[int], size: int) -> int: +def equilibrium_index(arr: list[int]) -> int: """ Find the equilibrium index of an array. Args: - arr : The input array of integers. - size : The size of the array. + arr (list[int]): The input array of integers. Returns: int: The equilibrium index or -1 if no equilibrium index exists. Examples: - >>> equilibrium_index([-7, 1, 5, 2, -4, 3, 0], 7) + >>> equilibrium_index([-7, 1, 5, 2, -4, 3, 0]) 3 - >>> equilibrium_index([1, 2, 3, 4, 5], 5) + >>> equilibrium_index([1, 2, 3, 4, 5]) -1 - >>> equilibrium_index([1, 1, 1, 1, 1], 5) + >>> equilibrium_index([1, 1, 1, 1, 1]) 2 - >>> equilibrium_index([2, 4, 6, 8, 10, 3], 6) + >>> equilibrium_index([2, 4, 6, 8, 10, 3]) -1 """ total_sum = sum(arr) left_sum = 0 - for i in range(size): - total_sum -= arr[i] + for i, value in enumerate(arr): + total_sum -= value if left_sum == total_sum: return i - left_sum += arr[i] + left_sum += value return -1 diff --git a/data_structures/arrays/index_2d_array_in_1d.py b/data_structures/arrays/index_2d_array_in_1d.py new file mode 100644 index 000000000..27a9fa5f9 --- /dev/null +++ b/data_structures/arrays/index_2d_array_in_1d.py @@ -0,0 +1,105 @@ +""" +Retrieves the value of an 0-indexed 1D index from a 2D array. 
+There are two ways to retrieve value(s): + +1. Index2DArrayIterator(matrix) -> Iterator[int] +This iterator allows you to iterate through a 2D array by passing in the matrix and +calling next(your_iterator). You can also use the iterator in a loop. +Examples: +list(Index2DArrayIterator(matrix)) +set(Index2DArrayIterator(matrix)) +tuple(Index2DArrayIterator(matrix)) +sum(Index2DArrayIterator(matrix)) +-5 in Index2DArrayIterator(matrix) + +2. index_2d_array_in_1d(array: list[int], index: int) -> int +This function allows you to provide a 2D array and a 0-indexed 1D integer index, +and retrieves the integer value at that index. + +Python doctests can be run using this command: +python3 -m doctest -v index_2d_array_in_1d.py +""" + +from collections.abc import Iterator +from dataclasses import dataclass + + +@dataclass +class Index2DArrayIterator: + matrix: list[list[int]] + + def __iter__(self) -> Iterator[int]: + """ + >>> tuple(Index2DArrayIterator([[5], [-523], [-1], [34], [0]])) + (5, -523, -1, 34, 0) + >>> tuple(Index2DArrayIterator([[5, -523, -1], [34, 0]])) + (5, -523, -1, 34, 0) + >>> tuple(Index2DArrayIterator([[5, -523, -1, 34, 0]])) + (5, -523, -1, 34, 0) + >>> t = Index2DArrayIterator([[5, 2, 25], [23, 14, 5], [324, -1, 0]]) + >>> tuple(t) + (5, 2, 25, 23, 14, 5, 324, -1, 0) + >>> list(t) + [5, 2, 25, 23, 14, 5, 324, -1, 0] + >>> sorted(t) + [-1, 0, 2, 5, 5, 14, 23, 25, 324] + >>> tuple(t)[3] + 23 + >>> sum(t) + 397 + >>> -1 in t + True + >>> t = iter(Index2DArrayIterator([[5], [-523], [-1], [34], [0]])) + >>> next(t) + 5 + >>> next(t) + -523 + """ + for row in self.matrix: + yield from row + + +def index_2d_array_in_1d(array: list[list[int]], index: int) -> int: + """ + Retrieves the value of the one-dimensional index from a two-dimensional array. + + Args: + array: A 2D array of integers where all rows are the same size and all + columns are the same size. + index: A 1D index. + + Returns: + int: The 0-indexed value of the 1D index in the array. 
+
+    Examples:
+        >>> index_2d_array_in_1d([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]], 5)
+        5
+        >>> index_2d_array_in_1d([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]], -1)
+        Traceback (most recent call last):
+            ...
+        ValueError: index out of range
+        >>> index_2d_array_in_1d([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]], 12)
+        Traceback (most recent call last):
+            ...
+        ValueError: index out of range
+        >>> index_2d_array_in_1d([[]], 0)
+        Traceback (most recent call last):
+            ...
+        ValueError: no items in array
+    """
+    rows = len(array)
+    cols = len(array[0]) if rows else 0  # guard: avoid IndexError on an empty array
+
+    if rows == 0 or cols == 0:
+        raise ValueError("no items in array")
+
+    if index < 0 or index >= rows * cols:
+        raise ValueError("index out of range")
+
+    return array[index // cols][index % cols]
+
+
+if __name__ == "__main__":
+    import doctest
+
+    doctest.testmod()
diff --git a/data_structures/arrays/kth_largest_element.py b/data_structures/arrays/kth_largest_element.py
new file mode 100644
index 000000000..f25cc68e9
--- /dev/null
+++ b/data_structures/arrays/kth_largest_element.py
@@ -0,0 +1,117 @@
+"""
+Given an array of integers and an integer k, find the kth largest element in the array.
+
+https://stackoverflow.com/questions/251781
+"""
+
+
+def partition(arr: list[int], low: int, high: int) -> int:
+    """
+    Partitions list based on the pivot element.
+
+    This function rearranges the elements in the input list 'arr' such that
+    all elements greater than or equal to the chosen pivot are on the left side
+    of the pivot, and all elements smaller than the pivot are on the right side.
+
+    Args:
+        arr: The list to be partitioned
+        low: The lower index of the list
+        high: The higher index of the list
+
+    Returns:
+        int: The index of pivot element after partitioning
+
+    Examples:
+        >>> partition([3, 1, 4, 5, 9, 2, 6, 5, 3, 5], 0, 9)
+        4
+        >>> partition([7, 1, 4, 5, 9, 2, 6, 5, 8], 0, 8)
+        1
+        >>> partition(['apple', 'cherry', 'date', 'banana'], 0, 3)
+        2
+        >>> partition([3.1, 1.2, 5.6, 4.7], 0, 3)
+        1
+    """
+    pivot = arr[high]
+    i = low - 1
+    for j in range(low, high):
+        if arr[j] >= pivot:
+            i += 1
+            arr[i], arr[j] = arr[j], arr[i]
+    arr[i + 1], arr[high] = arr[high], arr[i + 1]
+    return i + 1
+
+
+def kth_largest_element(arr: list[int], position: int) -> int:
+    """
+    Finds the kth largest element in a list.
+    Should deliver similar results to:
+    ```python
+    def kth_largest_element(arr, position):
+        return sorted(arr)[-position]
+    ```
+
+    Args:
+        arr: The list of numbers.
+        position: The position of the desired kth largest element.
+
+    Returns:
+        int: The kth largest element.
+
+    Examples:
+        >>> kth_largest_element([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5], 3)
+        5
+        >>> kth_largest_element([2, 5, 6, 1, 9, 3, 8, 4, 7, 3, 5], 1)
+        9
+        >>> kth_largest_element([2, 5, 6, 1, 9, 3, 8, 4, 7, 3, 5], -2)
+        Traceback (most recent call last):
+            ...
+        ValueError: Invalid value of 'position'
+        >>> kth_largest_element([9, 1, 3, 6, 7, 9, 8, 4, 2, 4, 9], 110)
+        Traceback (most recent call last):
+            ...
+        ValueError: Invalid value of 'position'
+        >>> kth_largest_element([1, 2, 4, 3, 5, 9, 7, 6, 5, 9, 3], 0)
+        Traceback (most recent call last):
+            ...
+        ValueError: Invalid value of 'position'
+        >>> kth_largest_element(['apple', 'cherry', 'date', 'banana'], 2)
+        'cherry'
+        >>> kth_largest_element([3.1, 1.2, 5.6, 4.7, 7.9, 5, 0], 2)
+        5.6
+        >>> kth_largest_element([-2, -5, -4, -1], 1)
+        -1
+        >>> kth_largest_element([], 1)
+        -1
+        >>> kth_largest_element([3.1, 1.2, 5.6, 4.7, 7.9, 5, 0], 1.5)
+        Traceback (most recent call last):
+            ...
+ ValueError: The position should be an integer + >>> kth_largest_element((4, 6, 1, 2), 4) + Traceback (most recent call last): + ... + TypeError: 'tuple' object does not support item assignment + """ + if not arr: + return -1 + if not isinstance(position, int): + raise ValueError("The position should be an integer") + if not 1 <= position <= len(arr): + raise ValueError("Invalid value of 'position'") + low, high = 0, len(arr) - 1 + while low <= high: + if low > len(arr) - 1 or high < 0: + return -1 + pivot_index = partition(arr, low, high) + if pivot_index == position - 1: + return arr[pivot_index] + elif pivot_index > position - 1: + high = pivot_index - 1 + else: + low = pivot_index + 1 + return -1 + + +if __name__ == "__main__": + import doctest + + doctest.testmod() diff --git a/data_structures/arrays/monotonic_array.py b/data_structures/arrays/monotonic_array.py new file mode 100644 index 000000000..c50a21530 --- /dev/null +++ b/data_structures/arrays/monotonic_array.py @@ -0,0 +1,23 @@ +# https://leetcode.com/problems/monotonic-array/ +def is_monotonic(nums: list[int]) -> bool: + """ + Check if a list is monotonic. 
+ + >>> is_monotonic([1, 2, 2, 3]) + True + >>> is_monotonic([6, 5, 4, 4]) + True + >>> is_monotonic([1, 3, 2]) + False + """ + return all(nums[i] <= nums[i + 1] for i in range(len(nums) - 1)) or all( + nums[i] >= nums[i + 1] for i in range(len(nums) - 1) + ) + + +# Test the function with your examples +if __name__ == "__main__": + # Test the function with your examples + print(is_monotonic([1, 2, 2, 3])) # Output: True + print(is_monotonic([6, 5, 4, 4])) # Output: True + print(is_monotonic([1, 3, 2])) # Output: False diff --git a/data_structures/binary_tree/binary_search_tree.py b/data_structures/binary_tree/binary_search_tree.py index a706d21e3..38691c475 100644 --- a/data_structures/binary_tree/binary_search_tree.py +++ b/data_structures/binary_tree/binary_search_tree.py @@ -14,6 +14,16 @@ Example >>> t.insert(8, 3, 6, 1, 10, 14, 13, 4, 7) >>> print(" ".join(repr(i.value) for i in t.traversal_tree())) 8 3 1 6 4 7 10 14 13 + +>>> tuple(i.value for i in t.traversal_tree(inorder)) +(1, 3, 4, 6, 7, 8, 10, 13, 14) +>>> tuple(t) +(1, 3, 4, 6, 7, 8, 10, 13, 14) +>>> t.find_kth_smallest(3, t.root) +4 +>>> tuple(t)[3-1] +4 + >>> print(" ".join(repr(i.value) for i in t.traversal_tree(postorder))) 1 4 7 6 3 13 14 10 8 >>> t.remove(20) @@ -39,8 +49,12 @@ Prints all the elements of the list in order traversal Test existence >>> t.search(6) is not None True +>>> 6 in t +True >>> t.search(-1) is not None False +>>> -1 in t +False >>> t.search(6).is_right True @@ -49,26 +63,47 @@ False >>> t.get_max().value 14 +>>> max(t) +14 >>> t.get_min().value 1 +>>> min(t) +1 >>> t.empty() False +>>> not t +False >>> for i in testlist: ... 
t.remove(i) >>> t.empty() True +>>> not t +True """ +from __future__ import annotations -from collections.abc import Iterable +from collections.abc import Iterable, Iterator +from dataclasses import dataclass from typing import Any +@dataclass class Node: - def __init__(self, value: int | None = None): - self.value = value - self.parent: Node | None = None # Added in order to delete a node easier - self.left: Node | None = None - self.right: Node | None = None + value: int + left: Node | None = None + right: Node | None = None + parent: Node | None = None # Added in order to delete a node easier + + def __iter__(self) -> Iterator[int]: + """ + >>> list(Node(0)) + [0] + >>> list(Node(0, Node(-1), Node(1), None)) + [-1, 0, 1] + """ + yield from self.left or [] + yield self.value + yield from self.right or [] def __repr__(self) -> str: from pprint import pformat @@ -79,12 +114,18 @@ class Node: @property def is_right(self) -> bool: - return self.parent is not None and self is self.parent.right + return bool(self.parent and self is self.parent.right) +@dataclass class BinarySearchTree: - def __init__(self, root: Node | None = None): - self.root = root + root: Node | None = None + + def __bool__(self) -> bool: + return bool(self.root) + + def __iter__(self) -> Iterator[int]: + yield from self.root or [] def __str__(self) -> str: """ @@ -227,6 +268,16 @@ class BinarySearchTree: return arr[k - 1] +def inorder(curr_node: Node | None) -> list[Node]: + """ + inorder (left, self, right) + """ + node_list = [] + if curr_node is not None: + node_list = inorder(curr_node.left) + [curr_node] + inorder(curr_node.right) + return node_list + + def postorder(curr_node: Node | None) -> list[Node]: """ postOrder (left, right, self) diff --git a/data_structures/hashing/double_hash.py b/data_structures/hashing/double_hash.py index be21e74ca..76c6c8681 100644 --- a/data_structures/hashing/double_hash.py +++ b/data_structures/hashing/double_hash.py @@ -35,6 +35,33 @@ class 
DoubleHash(HashTable): return (increment * self.__hash_function_2(key, data)) % self.size_table def _collision_resolution(self, key, data=None): + """ + Examples: + + 1. Try to add three data elements when the size is three + >>> dh = DoubleHash(3) + >>> dh.insert_data(10) + >>> dh.insert_data(20) + >>> dh.insert_data(30) + >>> dh.keys() + {1: 10, 2: 20, 0: 30} + + 2. Try to add three data elements when the size is two + >>> dh = DoubleHash(2) + >>> dh.insert_data(10) + >>> dh.insert_data(20) + >>> dh.insert_data(30) + >>> dh.keys() + {10: 10, 9: 20, 8: 30} + + 3. Try to add three data elements when the size is four + >>> dh = DoubleHash(4) + >>> dh.insert_data(10) + >>> dh.insert_data(20) + >>> dh.insert_data(30) + >>> dh.keys() + {9: 20, 10: 10, 8: 30} + """ i = 1 new_key = self.hash_function(data) @@ -50,3 +77,9 @@ class DoubleHash(HashTable): i += 1 return new_key + + +if __name__ == "__main__": + import doctest + + doctest.testmod() diff --git a/data_structures/hashing/hash_table.py b/data_structures/hashing/hash_table.py index 7ca2f7c40..7fe57068f 100644 --- a/data_structures/hashing/hash_table.py +++ b/data_structures/hashing/hash_table.py @@ -21,6 +21,29 @@ class HashTable: self._keys: dict = {} def keys(self): + """ + The keys function returns a dictionary containing the key value pairs. + key being the index number in hash table and value being the data value. + + Examples: + 1. creating HashTable with size 10 and inserting 3 elements + >>> ht = HashTable(10) + >>> ht.insert_data(10) + >>> ht.insert_data(20) + >>> ht.insert_data(30) + >>> ht.keys() + {0: 10, 1: 20, 2: 30} + + 2. 
creating HashTable with size 5 and inserting 5 elements + >>> ht = HashTable(5) + >>> ht.insert_data(5) + >>> ht.insert_data(4) + >>> ht.insert_data(3) + >>> ht.insert_data(2) + >>> ht.insert_data(1) + >>> ht.keys() + {0: 5, 4: 4, 3: 3, 2: 2, 1: 1} + """ return self._keys def balanced_factor(self): @@ -29,6 +52,30 @@ class HashTable: ) def hash_function(self, key): + """ + Generates hash for the given key value + + Examples: + + Creating HashTable with size 5 + >>> ht = HashTable(5) + >>> ht.hash_function(10) + 0 + >>> ht.hash_function(20) + 0 + >>> ht.hash_function(4) + 4 + >>> ht.hash_function(18) + 3 + >>> ht.hash_function(-18) + 2 + >>> ht.hash_function(18.5) + 3.5 + >>> ht.hash_function(0) + 0 + >>> ht.hash_function(-0) + 0 + """ return key % self.size_table def _step_by_step(self, step_ord): @@ -37,6 +84,43 @@ class HashTable: print(self.values) def bulk_insert(self, values): + """ + bulk_insert is used for entering more than one element at a time + in the HashTable. + + Examples: + 1. + >>> ht = HashTable(5) + >>> ht.bulk_insert((10,20,30)) + step 1 + [0, 1, 2, 3, 4] + [10, None, None, None, None] + step 2 + [0, 1, 2, 3, 4] + [10, 20, None, None, None] + step 3 + [0, 1, 2, 3, 4] + [10, 20, 30, None, None] + + 2. + >>> ht = HashTable(5) + >>> ht.bulk_insert([5,4,3,2,1]) + step 1 + [0, 1, 2, 3, 4] + [5, None, None, None, None] + step 2 + [0, 1, 2, 3, 4] + [5, None, None, None, 4] + step 3 + [0, 1, 2, 3, 4] + [5, None, None, 3, 4] + step 4 + [0, 1, 2, 3, 4] + [5, None, 2, 3, 4] + step 5 + [0, 1, 2, 3, 4] + [5, 1, 2, 3, 4] + """ i = 1 self.__aux_list = values for value in values: @@ -45,10 +129,99 @@ class HashTable: i += 1 def _set_value(self, key, data): + """ + _set_value functions allows to update value at a particular hash + + Examples: + 1. _set_value in HashTable of size 5 + >>> ht = HashTable(5) + >>> ht.insert_data(10) + >>> ht.insert_data(20) + >>> ht.insert_data(30) + >>> ht._set_value(0,15) + >>> ht.keys() + {0: 15, 1: 20, 2: 30} + + 2. 
_set_value in HashTable of size 2 + >>> ht = HashTable(2) + >>> ht.insert_data(17) + >>> ht.insert_data(18) + >>> ht.insert_data(99) + >>> ht._set_value(3,15) + >>> ht.keys() + {3: 15, 2: 17, 4: 99} + + 3. _set_value in HashTable when hash is not present + >>> ht = HashTable(2) + >>> ht.insert_data(17) + >>> ht.insert_data(18) + >>> ht.insert_data(99) + >>> ht._set_value(0,15) + >>> ht.keys() + {3: 18, 2: 17, 4: 99, 0: 15} + + 4. _set_value in HashTable when multiple hash are not present + >>> ht = HashTable(2) + >>> ht.insert_data(17) + >>> ht.insert_data(18) + >>> ht.insert_data(99) + >>> ht._set_value(0,15) + >>> ht._set_value(1,20) + >>> ht.keys() + {3: 18, 2: 17, 4: 99, 0: 15, 1: 20} + """ self.values[key] = data self._keys[key] = data def _collision_resolution(self, key, data=None): + """ + This method is a type of open addressing which is used for handling collision. + + In this implementation the concept of linear probing has been used. + + The hash table is searched sequentially from the original location of the + hash, if the new hash/location we get is already occupied we check for the next + hash/location. + + references: + - https://en.wikipedia.org/wiki/Linear_probing + + Examples: + 1. The collision will be with keys 18 & 99, so new hash will be created for 99 + >>> ht = HashTable(3) + >>> ht.insert_data(17) + >>> ht.insert_data(18) + >>> ht.insert_data(99) + >>> ht.keys() + {2: 17, 0: 18, 1: 99} + + 2. The collision will be with keys 17 & 101, so new hash + will be created for 101 + >>> ht = HashTable(4) + >>> ht.insert_data(17) + >>> ht.insert_data(18) + >>> ht.insert_data(99) + >>> ht.insert_data(101) + >>> ht.keys() + {1: 17, 2: 18, 3: 99, 0: 101} + + 2. The collision will be with all keys, so new hash will be created for all + >>> ht = HashTable(1) + >>> ht.insert_data(17) + >>> ht.insert_data(18) + >>> ht.insert_data(99) + >>> ht.keys() + {2: 17, 3: 18, 4: 99} + + 3. 
Trying to insert float key in hash + >>> ht = HashTable(1) + >>> ht.insert_data(17) + >>> ht.insert_data(18) + >>> ht.insert_data(99.99) + Traceback (most recent call last): + ... + TypeError: list indices must be integers or slices, not float + """ new_key = self.hash_function(key + 1) while self.values[new_key] is not None and self.values[new_key] != key: @@ -69,6 +242,21 @@ class HashTable: self.insert_data(value) def insert_data(self, data): + """ + insert_data is used for inserting a single element at a time in the HashTable. + + Examples: + + >>> ht = HashTable(3) + >>> ht.insert_data(5) + >>> ht.keys() + {2: 5} + >>> ht = HashTable(5) + >>> ht.insert_data(30) + >>> ht.insert_data(50) + >>> ht.keys() + {0: 30, 1: 50} + """ key = self.hash_function(data) if self.values[key] is None: @@ -84,3 +272,9 @@ class HashTable: else: self.rehashing() self.insert_data(data) + + +if __name__ == "__main__": + import doctest + + doctest.testmod() diff --git a/data_structures/hashing/quadratic_probing.py b/data_structures/hashing/quadratic_probing.py index 0930340a3..2f3401ec8 100644 --- a/data_structures/hashing/quadratic_probing.py +++ b/data_structures/hashing/quadratic_probing.py @@ -12,6 +12,55 @@ class QuadraticProbing(HashTable): super().__init__(*args, **kwargs) def _collision_resolution(self, key, data=None): + """ + Quadratic probing is an open addressing scheme used for resolving + collisions in hash table. + + It works by taking the original hash index and adding successive + values of an arbitrary quadratic polynomial until open slot is found. + + Hash + 1², Hash + 2², Hash + 3² .... Hash + n² + + reference: + - https://en.wikipedia.org/wiki/Quadratic_probing + e.g: + 1. 
Create hash table with size 7 + >>> qp = QuadraticProbing(7) + >>> qp.insert_data(90) + >>> qp.insert_data(340) + >>> qp.insert_data(24) + >>> qp.insert_data(45) + >>> qp.insert_data(99) + >>> qp.insert_data(73) + >>> qp.insert_data(7) + >>> qp.keys() + {11: 45, 14: 99, 7: 24, 0: 340, 5: 73, 6: 90, 8: 7} + + 2. Create hash table with size 8 + >>> qp = QuadraticProbing(8) + >>> qp.insert_data(0) + >>> qp.insert_data(999) + >>> qp.insert_data(111) + >>> qp.keys() + {0: 0, 7: 999, 3: 111} + + 3. Try to add three data elements when the size is two + >>> qp = QuadraticProbing(2) + >>> qp.insert_data(0) + >>> qp.insert_data(999) + >>> qp.insert_data(111) + >>> qp.keys() + {0: 0, 4: 999, 1: 111} + + 4. Try to add three data elements when the size is one + >>> qp = QuadraticProbing(1) + >>> qp.insert_data(0) + >>> qp.insert_data(999) + >>> qp.insert_data(111) + >>> qp.keys() + {4: 999, 1: 111} + """ + i = 1 new_key = self.hash_function(key + i * i) @@ -27,3 +76,9 @@ class QuadraticProbing(HashTable): break return new_key + + +if __name__ == "__main__": + import doctest + + doctest.testmod() diff --git a/electronics/capacitor_equivalence.py b/electronics/capacitor_equivalence.py new file mode 100644 index 000000000..274b18afb --- /dev/null +++ b/electronics/capacitor_equivalence.py @@ -0,0 +1,53 @@ +# https://farside.ph.utexas.edu/teaching/316/lectures/node46.html + +from __future__ import annotations + + +def capacitor_parallel(capacitors: list[float]) -> float: + """ + Ceq = C1 + C2 + ... + Cn + Calculate the equivalent resistance for any number of capacitors in parallel. + >>> capacitor_parallel([5.71389, 12, 3]) + 20.71389 + >>> capacitor_parallel([5.71389, 12, -3]) + Traceback (most recent call last): + ... + ValueError: Capacitor at index 2 has a negative value! + """ + sum_c = 0.0 + for index, capacitor in enumerate(capacitors): + if capacitor < 0: + msg = f"Capacitor at index {index} has a negative value!" 
+ raise ValueError(msg) + sum_c += capacitor + return sum_c + + +def capacitor_series(capacitors: list[float]) -> float: + """ + Ceq = 1/ (1/C1 + 1/C2 + ... + 1/Cn) + >>> capacitor_series([5.71389, 12, 3]) + 1.6901062252507735 + >>> capacitor_series([5.71389, 12, -3]) + Traceback (most recent call last): + ... + ValueError: Capacitor at index 2 has a negative or zero value! + >>> capacitor_series([5.71389, 12, 0.000]) + Traceback (most recent call last): + ... + ValueError: Capacitor at index 2 has a negative or zero value! + """ + + first_sum = 0.0 + for index, capacitor in enumerate(capacitors): + if capacitor <= 0: + msg = f"Capacitor at index {index} has a negative or zero value!" + raise ValueError(msg) + first_sum += 1 / capacitor + return 1 / first_sum + + +if __name__ == "__main__": + import doctest + + doctest.testmod() diff --git a/graphs/tarjans_scc.py b/graphs/tarjans_scc.py index dfd2e5270..a75dc4d2c 100644 --- a/graphs/tarjans_scc.py +++ b/graphs/tarjans_scc.py @@ -1,7 +1,7 @@ from collections import deque -def tarjan(g): +def tarjan(g: list[list[int]]) -> list[list[int]]: """ Tarjan's algo for finding strongly connected components in a directed graph @@ -19,15 +19,30 @@ def tarjan(g): Complexity: strong_connect() is called at most once for each node and has a complexity of O(|E|) as it is DFS. Therefore this has complexity O(|V| + |E|) for a graph G = (V, E) + + >>> tarjan([[2, 3, 4], [2, 3, 4], [0, 1, 3], [0, 1, 2], [1]]) + [[4, 3, 1, 2, 0]] + >>> tarjan([[], [], [], []]) + [[0], [1], [2], [3]] + >>> a = [0, 1, 2, 3, 4, 5, 4] + >>> b = [1, 0, 3, 2, 5, 4, 0] + >>> n = 7 + >>> sorted(tarjan(create_graph(n, list(zip(a, b))))) == sorted( + ... 
tarjan(create_graph(n, list(zip(a[::-1], b[::-1]))))) + True + >>> a = [0, 1, 2, 3, 4, 5, 6] + >>> b = [0, 1, 2, 3, 4, 5, 6] + >>> sorted(tarjan(create_graph(n, list(zip(a, b))))) + [[0], [1], [2], [3], [4], [5], [6]] """ n = len(g) - stack = deque() + stack: deque[int] = deque() on_stack = [False for _ in range(n)] index_of = [-1 for _ in range(n)] lowlink_of = index_of[:] - def strong_connect(v, index, components): + def strong_connect(v: int, index: int, components: list[list[int]]) -> int: index_of[v] = index # the number when this node is seen lowlink_of[v] = index # lowest rank node reachable from here index += 1 @@ -57,7 +72,7 @@ def tarjan(g): components.append(component) return index - components = [] + components: list[list[int]] = [] for v in range(n): if index_of[v] == -1: strong_connect(v, 0, components) @@ -65,8 +80,16 @@ def tarjan(g): return components -def create_graph(n, edges): - g = [[] for _ in range(n)] +def create_graph(n: int, edges: list[tuple[int, int]]) -> list[list[int]]: + """ + >>> n = 7 + >>> source = [0, 0, 1, 2, 3, 3, 4, 4, 6] + >>> target = [1, 3, 2, 0, 1, 4, 5, 6, 5] + >>> edges = list(zip(source, target)) + >>> create_graph(n, edges) + [[1, 3], [2], [0], [1, 4], [5, 6], [], [5]] + """ + g: list[list[int]] = [[] for _ in range(n)] for u, v in edges: g[u].append(v) return g diff --git a/machine_learning/automatic_differentiation.py b/machine_learning/automatic_differentiation.py new file mode 100644 index 000000000..cd2e5cdaa --- /dev/null +++ b/machine_learning/automatic_differentiation.py @@ -0,0 +1,327 @@ +""" +Demonstration of the Automatic Differentiation (Reverse mode). 
+ +Reference: https://en.wikipedia.org/wiki/Automatic_differentiation + +Author: Poojan Smart +Email: smrtpoojan@gmail.com +""" +from __future__ import annotations + +from collections import defaultdict +from enum import Enum +from types import TracebackType +from typing import Any + +import numpy as np +from typing_extensions import Self # noqa: UP035 + + +class OpType(Enum): + """ + Class represents list of supported operations on Variable for gradient calculation. + """ + + ADD = 0 + SUB = 1 + MUL = 2 + DIV = 3 + MATMUL = 4 + POWER = 5 + NOOP = 6 + + +class Variable: + """ + Class represents n-dimensional object which is used to wrap numpy array on which + operations will be performed and the gradient will be calculated. + + Examples: + >>> Variable(5.0) + Variable(5.0) + >>> Variable([5.0, 2.9]) + Variable([5. 2.9]) + >>> Variable([5.0, 2.9]) + Variable([1.0, 5.5]) + Variable([6. 8.4]) + >>> Variable([[8.0, 10.0]]) + Variable([[ 8. 10.]]) + """ + + def __init__(self, value: Any) -> None: + self.value = np.array(value) + + # pointers to the operations to which the Variable is input + self.param_to: list[Operation] = [] + # pointer to the operation of which the Variable is output of + self.result_of: Operation = Operation(OpType.NOOP) + + def __repr__(self) -> str: + return f"Variable({self.value})" + + def to_ndarray(self) -> np.ndarray: + return self.value + + def __add__(self, other: Variable) -> Variable: + result = Variable(self.value + other.value) + + with GradientTracker() as tracker: + # if tracker is enabled, computation graph will be updated + if tracker.enabled: + tracker.append(OpType.ADD, params=[self, other], output=result) + return result + + def __sub__(self, other: Variable) -> Variable: + result = Variable(self.value - other.value) + + with GradientTracker() as tracker: + # if tracker is enabled, computation graph will be updated + if tracker.enabled: + tracker.append(OpType.SUB, params=[self, other], output=result) + return result + + def 
__mul__(self, other: Variable) -> Variable:
+        result = Variable(self.value * other.value)
+
+        with GradientTracker() as tracker:
+            # if tracker is enabled, computation graph will be updated
+            if tracker.enabled:
+                tracker.append(OpType.MUL, params=[self, other], output=result)
+        return result
+
+    def __truediv__(self, other: Variable) -> Variable:
+        result = Variable(self.value / other.value)
+
+        with GradientTracker() as tracker:
+            # if tracker is enabled, computation graph will be updated
+            if tracker.enabled:
+                tracker.append(OpType.DIV, params=[self, other], output=result)
+        return result
+
+    def __matmul__(self, other: Variable) -> Variable:
+        result = Variable(self.value @ other.value)
+
+        with GradientTracker() as tracker:
+            # if tracker is enabled, computation graph will be updated
+            if tracker.enabled:
+                tracker.append(OpType.MATMUL, params=[self, other], output=result)
+        return result
+
+    def __pow__(self, power: int) -> Variable:
+        result = Variable(self.value**power)
+
+        with GradientTracker() as tracker:
+            # if tracker is enabled, computation graph will be updated
+            if tracker.enabled:
+                tracker.append(
+                    OpType.POWER,
+                    params=[self],
+                    output=result,
+                    other_params={"power": power},
+                )
+        return result
+
+    def add_param_to(self, param_to: Operation) -> None:
+        self.param_to.append(param_to)
+
+    def add_result_of(self, result_of: Operation) -> None:
+        self.result_of = result_of
+
+
+class Operation:
+    """
+    Class represents an operation between one or two Variable objects.
+    Operation objects contain the type of operation, pointers to the input Variable
+    objects, and a pointer to the Variable resulting from the operation.
+    """
+
+    def __init__(
+        self,
+        op_type: OpType,
+        other_params: dict | None = None,
+    ) -> None:
+        self.op_type = op_type
+        self.other_params = {} if other_params is None else other_params
+
+    def add_params(self, params: list[Variable]) -> None:
+        self.params = params
+
+    def add_output(self, output: Variable) -> None:
+        self.output = output
+
+    def __eq__(self, value) -> bool:
+        return self.op_type == value if isinstance(value, OpType) else False
+
+
+class GradientTracker:
+    """
+    Class contains methods to compute partial derivatives of Variable
+    based on the computation graph.
+
+    Examples:
+
+    >>> with GradientTracker() as tracker:
+    ...     a = Variable([2.0, 5.0])
+    ...     b = Variable([1.0, 2.0])
+    ...     m = Variable([1.0, 2.0])
+    ...     c = a + b
+    ...     d = a * b
+    ...     e = c / d
+    >>> tracker.gradient(e, a)
+    array([-0.25, -0.04])
+    >>> tracker.gradient(e, b)
+    array([-1.  , -0.25])
+    >>> tracker.gradient(e, m) is None
+    True
+
+    >>> with GradientTracker() as tracker:
+    ...     a = Variable([[2.0, 5.0]])
+    ...     b = Variable([[1.0], [2.0]])
+    ...     c = a @ b
+    >>> tracker.gradient(c, a)
+    array([[1., 2.]])
+    >>> tracker.gradient(c, b)
+    array([[2.],
+           [5.]])
+
+    >>> with GradientTracker() as tracker:
+    ...     a = Variable([[2.0, 5.0]])
+    ...     b = a ** 3
+    >>> tracker.gradient(b, a)
+    array([[12., 75.]])
+    """
+
+    instance = None
+
+    def __new__(cls) -> Self:
+        """
+        Executed once, at the first creation of a class object; subsequent
+        calls return the already-created object. This class follows the
+        singleton design pattern.
+ """ + if cls.instance is None: + cls.instance = super().__new__(cls) + return cls.instance + + def __init__(self) -> None: + self.enabled = False + + def __enter__(self) -> Self: + self.enabled = True + return self + + def __exit__( + self, + exc_type: type[BaseException] | None, + exc: BaseException | None, + traceback: TracebackType | None, + ) -> None: + self.enabled = False + + def append( + self, + op_type: OpType, + params: list[Variable], + output: Variable, + other_params: dict | None = None, + ) -> None: + """ + Adds Operation object to the related Variable objects for + creating computational graph for calculating gradients. + + Args: + op_type: Operation type + params: Input parameters to the operation + output: Output variable of the operation + """ + operation = Operation(op_type, other_params=other_params) + param_nodes = [] + for param in params: + param.add_param_to(operation) + param_nodes.append(param) + output.add_result_of(operation) + + operation.add_params(param_nodes) + operation.add_output(output) + + def gradient(self, target: Variable, source: Variable) -> np.ndarray | None: + """ + Reverse accumulation of partial derivatives to calculate gradients + of target variable with respect to source variable. + + Args: + target: target variable for which gradients are calculated. + source: source variable with respect to which the gradients are + calculated. 
+
+        Returns:
+            Gradient of the target variable with respect to the source variable
+        """
+
+        # partial derivatives with respect to target
+        partial_deriv = defaultdict(lambda: 0)
+        partial_deriv[target] = np.ones_like(target.to_ndarray())
+
+        # iterate through each operation in the computation graph
+        operation_queue = [target.result_of]
+        while len(operation_queue) > 0:
+            operation = operation_queue.pop()
+            for param in operation.params:
+                # as per the chain rule, multiplying partial derivatives
+                # of variables with respect to the target
+                dparam_doutput = self.derivative(param, operation)
+                dparam_dtarget = dparam_doutput * partial_deriv[operation.output]
+                partial_deriv[param] += dparam_dtarget
+
+                if param.result_of and param.result_of != OpType.NOOP:
+                    operation_queue.append(param.result_of)
+
+        return partial_deriv.get(source)
+
+    def derivative(self, param: Variable, operation: Operation) -> np.ndarray:
+        """
+        Compute the derivative of the given operation/function
+
+        Args:
+            param: variable to be differentiated
+            operation: function performed on the input variable
+
+        Returns:
+            Derivative of the operation's output with respect to the
+            input variable
+        """
+        params = operation.params
+
+        if operation == OpType.ADD:
+            return np.ones_like(params[0].to_ndarray(), dtype=np.float64)
+        if operation == OpType.SUB:
+            if params[0] == param:
+                return np.ones_like(params[0].to_ndarray(), dtype=np.float64)
+            return -np.ones_like(params[1].to_ndarray(), dtype=np.float64)
+        if operation == OpType.MUL:
+            return (
+                params[1].to_ndarray().T
+                if params[0] == param
+                else params[0].to_ndarray().T
+            )
+        if operation == OpType.DIV:
+            if params[0] == param:
+                return 1 / params[1].to_ndarray()
+            return -params[0].to_ndarray() / (params[1].to_ndarray() ** 2)
+        if operation == OpType.MATMUL:
+            return (
+                params[1].to_ndarray().T
+                if params[0] == param
+                else params[0].to_ndarray().T
+            )
+        if operation == OpType.POWER:
+            power = operation.other_params["power"]
+            return power * 
(params[0].to_ndarray() ** (power - 1)) + + err_msg = f"invalid operation type: {operation.op_type}" + raise ValueError(err_msg) + + +if __name__ == "__main__": + import doctest + + doctest.testmod() diff --git a/machine_learning/logistic_regression.py b/machine_learning/logistic_regression.py index f9da0104a..59a70fd65 100644 --- a/machine_learning/logistic_regression.py +++ b/machine_learning/logistic_regression.py @@ -27,7 +27,7 @@ from sklearn import datasets # classification problems -def sigmoid_function(z): +def sigmoid_function(z: float | np.ndarray) -> float | np.ndarray: """ Also known as Logistic Function. @@ -42,11 +42,63 @@ def sigmoid_function(z): @param z: input to the function @returns: returns value in the range 0 to 1 + + Examples: + >>> sigmoid_function(4) + 0.9820137900379085 + >>> sigmoid_function(np.array([-3, 3])) + array([0.04742587, 0.95257413]) + >>> sigmoid_function(np.array([-3, 3, 1])) + array([0.04742587, 0.95257413, 0.73105858]) + >>> sigmoid_function(np.array([-0.01, -2, -1.9])) + array([0.49750002, 0.11920292, 0.13010847]) + >>> sigmoid_function(np.array([-1.3, 5.3, 12])) + array([0.21416502, 0.9950332 , 0.99999386]) + >>> sigmoid_function(np.array([0.01, 0.02, 4.1])) + array([0.50249998, 0.50499983, 0.9836975 ]) + >>> sigmoid_function(np.array([0.8])) + array([0.68997448]) """ return 1 / (1 + np.exp(-z)) -def cost_function(h, y): +def cost_function(h: np.ndarray, y: np.ndarray) -> float: + """ + Cost function quantifies the error between predicted and expected values. + The cost function used in Logistic Regression is called Log Loss + or Cross Entropy Function. 
+ + J(θ) = (1/m) * Σ [ -y * log(hθ(x)) - (1 - y) * log(1 - hθ(x)) ] + + Where: + - J(θ) is the cost that we want to minimize during training + - m is the number of training examples + - Σ represents the summation over all training examples + - y is the actual binary label (0 or 1) for a given example + - hθ(x) is the predicted probability that x belongs to the positive class + + @param h: the output of sigmoid function. It is the estimated probability + that the input example 'x' belongs to the positive class + + @param y: the actual binary label associated with input example 'x' + + Examples: + >>> estimations = sigmoid_function(np.array([0.3, -4.3, 8.1])) + >>> cost_function(h=estimations,y=np.array([1, 0, 1])) + 0.18937868932131605 + >>> estimations = sigmoid_function(np.array([4, 3, 1])) + >>> cost_function(h=estimations,y=np.array([1, 0, 0])) + 1.459999655669926 + >>> estimations = sigmoid_function(np.array([4, -3, -1])) + >>> cost_function(h=estimations,y=np.array([1,0,0])) + 0.1266663223365915 + >>> estimations = sigmoid_function(0) + >>> cost_function(h=estimations,y=np.array([1])) + 0.6931471805599453 + + References: + - https://en.wikipedia.org/wiki/Logistic_regression + """ return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean() @@ -75,6 +127,10 @@ def logistic_reg(alpha, x, y, max_iterations=70000): # In[68]: if __name__ == "__main__": + import doctest + + doctest.testmod() + iris = datasets.load_iris() x = iris.data[:, :2] y = (iris.target != 0) * 1 diff --git a/machine_learning/loss_functions.py b/machine_learning/loss_functions.py index ef3429636..36a760326 100644 --- a/machine_learning/loss_functions.py +++ b/machine_learning/loss_functions.py @@ -261,6 +261,43 @@ def mean_squared_error(y_true: np.ndarray, y_pred: np.ndarray) -> float: return np.mean(squared_errors) +def mean_absolute_error(y_true: np.ndarray, y_pred: np.ndarray) -> float: + """ + Calculates the Mean Absolute Error (MAE) between ground truth (observed) + and predicted values. 
+
+    MAE measures the average absolute difference between the true and predicted values.
+
+    Equation:
+    MAE = (1/n) * Σ(abs(y_true - y_pred))
+
+    Reference: https://en.wikipedia.org/wiki/Mean_absolute_error
+
+    Parameters:
+    - y_true: The true values (ground truth)
+    - y_pred: The predicted values
+
+    >>> true_values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
+    >>> predicted_values = np.array([0.8, 2.1, 2.9, 4.2, 5.2])
+    >>> np.isclose(mean_absolute_error(true_values, predicted_values), 0.16)
+    True
+    >>> true_values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
+    >>> predicted_values = np.array([0.8, 2.1, 2.9, 4.2, 5.2])
+    >>> np.isclose(mean_absolute_error(true_values, predicted_values), 2.16)
+    False
+    >>> true_labels = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
+    >>> predicted_probs = np.array([0.3, 0.8, 0.9, 5.2])
+    >>> mean_absolute_error(true_labels, predicted_probs)
+    Traceback (most recent call last):
+        ...
+    ValueError: Input arrays must have the same length.
+    """
+    if len(y_true) != len(y_pred):
+        raise ValueError("Input arrays must have the same length.")
+
+    return np.mean(abs(y_true - y_pred))
+
+
 def mean_squared_logarithmic_error(y_true: np.ndarray, y_pred: np.ndarray) -> float:
     """
     Calculate the mean squared logarithmic error (MSLE) between ground truth and
@@ -297,6 +334,143 @@ def mean_squared_logarithmic_error(y_true: np.ndarray, y_pred: np.ndarray) -> fl
     return np.mean(squared_logarithmic_errors)
 
 
+def mean_absolute_percentage_error(
+    y_true: np.ndarray, y_pred: np.ndarray, epsilon: float = 1e-15
+) -> float:
+    """
+    Calculate the Mean Absolute Percentage Error between y_true and y_pred.
+
+    Mean Absolute Percentage Error calculates the average of the absolute
+    percentage differences between the predicted and true values.
+
+    Formula: MAPE = (1/n) * Σ|(y_true[i] - y_pred[i]) / y_true[i]|
+
+    Source: https://stephenallwright.com/good-mape-score/
+
+    Parameters:
+    y_true (np.ndarray): Numpy array containing true/target values.
+    y_pred (np.ndarray): Numpy array containing predicted values.
+
+    Returns:
+    float: The Mean Absolute Percentage Error between y_true and y_pred.
+
+    Examples:
+    >>> y_true = np.array([10, 20, 30, 40])
+    >>> y_pred = np.array([12, 18, 33, 45])
+    >>> mean_absolute_percentage_error(y_true, y_pred)
+    0.13125
+
+    >>> y_true = np.array([1, 2, 3, 4])
+    >>> y_pred = np.array([2, 3, 4, 5])
+    >>> mean_absolute_percentage_error(y_true, y_pred)
+    0.5208333333333333
+
+    >>> y_true = np.array([34, 37, 44, 47, 48, 48, 46, 43, 32, 27, 26, 24])
+    >>> y_pred = np.array([37, 40, 46, 44, 46, 50, 45, 44, 34, 30, 22, 23])
+    >>> mean_absolute_percentage_error(y_true, y_pred)
+    0.064671076436071
+    """
+    if len(y_true) != len(y_pred):
+        raise ValueError("The length of the two arrays should be the same.")
+
+    y_true = np.where(y_true == 0, epsilon, y_true)
+    absolute_percentage_diff = np.abs((y_true - y_pred) / y_true)
+
+    return np.mean(absolute_percentage_diff)
+
+
+def perplexity_loss(
+    y_true: np.ndarray, y_pred: np.ndarray, epsilon: float = 1e-7
+) -> float:
+    """
+    Calculate the perplexity for the y_true and y_pred.
+
+    Perplexity is useful for evaluating language models in
+    Natural Language Processing (NLP).
+    It is a measure of how certain the model is in its predictions.
+
+    Perplexity Loss = exp(-(1/N) * Σ ln(p(x)))
+
+    Reference:
+    https://en.wikipedia.org/wiki/Perplexity
+
+    Args:
+        y_true: Actual label encoded sentences of shape (batch_size, sentence_length)
+        y_pred: Predicted sentences of shape (batch_size, sentence_length, vocab_size)
+        epsilon: Small floating point number to avoid getting inf for log(0)
+
+    Returns:
+        Perplexity loss between y_true and y_pred.
+
+    >>> y_true = np.array([[1, 4], [2, 3]])
+    >>> y_pred = np.array(
+    ...     [[[0.28, 0.19, 0.21 , 0.15, 0.15],
+    ...       [0.24, 0.19, 0.09, 0.18, 0.27]],
+    ...      [[0.03, 0.26, 0.21, 0.18, 0.30],
+    ...       [0.28, 0.10, 0.33, 0.15, 0.12]]]
+    ... 
) + >>> perplexity_loss(y_true, y_pred) + 5.0247347775367945 + >>> y_true = np.array([[1, 4], [2, 3]]) + >>> y_pred = np.array( + ... [[[0.28, 0.19, 0.21 , 0.15, 0.15], + ... [0.24, 0.19, 0.09, 0.18, 0.27], + ... [0.30, 0.10, 0.20, 0.15, 0.25]], + ... [[0.03, 0.26, 0.21, 0.18, 0.30], + ... [0.28, 0.10, 0.33, 0.15, 0.12], + ... [0.30, 0.10, 0.20, 0.15, 0.25]],] + ... ) + >>> perplexity_loss(y_true, y_pred) + Traceback (most recent call last): + ... + ValueError: Sentence length of y_true and y_pred must be equal. + >>> y_true = np.array([[1, 4], [2, 11]]) + >>> y_pred = np.array( + ... [[[0.28, 0.19, 0.21 , 0.15, 0.15], + ... [0.24, 0.19, 0.09, 0.18, 0.27]], + ... [[0.03, 0.26, 0.21, 0.18, 0.30], + ... [0.28, 0.10, 0.33, 0.15, 0.12]]] + ... ) + >>> perplexity_loss(y_true, y_pred) + Traceback (most recent call last): + ... + ValueError: Label value must not be greater than vocabulary size. + >>> y_true = np.array([[1, 4]]) + >>> y_pred = np.array( + ... [[[0.28, 0.19, 0.21 , 0.15, 0.15], + ... [0.24, 0.19, 0.09, 0.18, 0.27]], + ... [[0.03, 0.26, 0.21, 0.18, 0.30], + ... [0.28, 0.10, 0.33, 0.15, 0.12]]] + ... ) + >>> perplexity_loss(y_true, y_pred) + Traceback (most recent call last): + ... + ValueError: Batch size of y_true and y_pred must be equal. 
+ """ + + vocab_size = y_pred.shape[2] + + if y_true.shape[0] != y_pred.shape[0]: + raise ValueError("Batch size of y_true and y_pred must be equal.") + if y_true.shape[1] != y_pred.shape[1]: + raise ValueError("Sentence length of y_true and y_pred must be equal.") + if np.max(y_true) > vocab_size: + raise ValueError("Label value must not be greater than vocabulary size.") + + # Matrix to select prediction value only for true class + filter_matrix = np.array( + [[list(np.eye(vocab_size)[word]) for word in sentence] for sentence in y_true] + ) + + # Getting the matrix containing prediction for only true class + true_class_pred = np.sum(y_pred * filter_matrix, axis=2).clip(epsilon, 1) + + # Calculating perplexity for each sentence + perp_losses = np.exp(np.negative(np.mean(np.log(true_class_pred), axis=1))) + + return np.mean(perp_losses) + + if __name__ == "__main__": import doctest diff --git a/maths/numerical_analysis/runge_kutta_gills.py b/maths/numerical_analysis/runge_kutta_gills.py new file mode 100644 index 000000000..2bd9cd612 --- /dev/null +++ b/maths/numerical_analysis/runge_kutta_gills.py @@ -0,0 +1,89 @@ +""" +Use the Runge-Kutta-Gill's method of order 4 to solve Ordinary Differential Equations. + +https://www.geeksforgeeks.org/gills-4th-order-method-to-solve-differential-equations/ +Author : Ravi Kumar +""" +from collections.abc import Callable +from math import sqrt + +import numpy as np + + +def runge_kutta_gills( + func: Callable[[float, float], float], + x_initial: float, + y_initial: float, + step_size: float, + x_final: float, +) -> np.ndarray: + """ + Solve an Ordinary Differential Equations using Runge-Kutta-Gills Method of order 4. + + args: + func: An ordinary differential equation (ODE) as function of x and y. + x_initial: The initial value of x. + y_initial: The initial value of y. + step_size: The increment value of x. + x_final: The final value of x. + + Returns: + Solution of y at each nodal point + + >>> def f(x, y): + ... 
return (x-y)/2 + >>> y = runge_kutta_gills(f, 0, 3, 0.2, 5) + >>> y[-1] + 3.4104259225717537 + + >>> def f(x,y): + ... return x + >>> y = runge_kutta_gills(f, -1, 0, 0.2, 0) + >>> y + array([ 0. , -0.18, -0.32, -0.42, -0.48, -0.5 ]) + + >>> def f(x, y): + ... return x + y + >>> y = runge_kutta_gills(f, 0, 0, 0.2, -1) + Traceback (most recent call last): + ... + ValueError: The final value of x must be greater than initial value of x. + + >>> def f(x, y): + ... return x + >>> y = runge_kutta_gills(f, -1, 0, -0.2, 0) + Traceback (most recent call last): + ... + ValueError: Step size must be positive. + """ + if x_initial >= x_final: + raise ValueError( + "The final value of x must be greater than initial value of x." + ) + + if step_size <= 0: + raise ValueError("Step size must be positive.") + + n = int((x_final - x_initial) / step_size) + y = np.zeros(n + 1) + y[0] = y_initial + for i in range(n): + k1 = step_size * func(x_initial, y[i]) + k2 = step_size * func(x_initial + step_size / 2, y[i] + k1 / 2) + k3 = step_size * func( + x_initial + step_size / 2, + y[i] + (-0.5 + 1 / sqrt(2)) * k1 + (1 - 1 / sqrt(2)) * k2, + ) + k4 = step_size * func( + x_initial + step_size, y[i] - (1 / sqrt(2)) * k2 + (1 + 1 / sqrt(2)) * k3 + ) + + y[i + 1] = y[i] + (k1 + (2 - sqrt(2)) * k2 + (2 + sqrt(2)) * k3 + k4) / 6 + x_initial += step_size + return y + + +if __name__ == "__main__": + import doctest + + doctest.testmod() diff --git a/maths/prime_check.py b/maths/prime_check.py index c17877a57..f1bc4def2 100644 --- a/maths/prime_check.py +++ b/maths/prime_check.py @@ -29,12 +29,19 @@ def is_prime(number: int) -> bool: True >>> is_prime(67483) False + >>> is_prime(16.1) + Traceback (most recent call last): + ... + ValueError: is_prime() only accepts positive integers + >>> is_prime(-4) + Traceback (most recent call last): + ... 
+ ValueError: is_prime() only accepts positive integers """ # precondition - assert isinstance(number, int) and ( - number >= 0 - ), "'number' must been an int and positive" + if not isinstance(number, int) or not number >= 0: + raise ValueError("is_prime() only accepts positive integers") if 1 < number < 4: # 2 and 3 are primes @@ -64,7 +71,7 @@ class Test(unittest.TestCase): assert is_prime(29) def test_not_primes(self): - with pytest.raises(AssertionError): + with pytest.raises(ValueError): is_prime(-19) assert not is_prime( 0 diff --git a/maths/primelib.py b/maths/primelib.py index e2d432e18..a26b0eaeb 100644 --- a/maths/primelib.py +++ b/maths/primelib.py @@ -454,6 +454,8 @@ def kg_v(number1, number2): 40 >>> kg_v(824,67) 55208 + >>> kg_v(1, 10) + 10 >>> kg_v(0) Traceback (most recent call last): ... diff --git a/networking_flow/ford_fulkerson.py b/networking_flow/ford_fulkerson.py index 716ed508e..7d5fb522e 100644 --- a/networking_flow/ford_fulkerson.py +++ b/networking_flow/ford_fulkerson.py @@ -1,39 +1,95 @@ -# Ford-Fulkerson Algorithm for Maximum Flow Problem """ +Ford-Fulkerson Algorithm for Maximum Flow Problem +* https://en.wikipedia.org/wiki/Ford%E2%80%93Fulkerson_algorithm + Description: - (1) Start with initial flow as 0; - (2) Choose augmenting path from source to sink and add path to flow; + (1) Start with initial flow as 0 + (2) Choose the augmenting path from source to sink and add the path to flow """ +graph = [ + [0, 16, 13, 0, 0, 0], + [0, 0, 10, 12, 0, 0], + [0, 4, 0, 0, 14, 0], + [0, 0, 9, 0, 0, 20], + [0, 0, 0, 7, 0, 4], + [0, 0, 0, 0, 0, 0], +] -def bfs(graph, s, t, parent): - # Return True if there is node that has not iterated. - visited = [False] * len(graph) - queue = [] - queue.append(s) - visited[s] = True +def breadth_first_search(graph: list, source: int, sink: int, parents: list) -> bool: + """ + This function returns True if there is a node that has not iterated. 
+
+    Args:
+        graph: Adjacency matrix of graph
+        source: Source
+        sink: Sink
+        parents: Parent list
+
+    Returns:
+        True if the sink is reachable from the source, i.e. an augmenting
+        path exists.
+
+    >>> breadth_first_search(graph, 0, 5, [-1, -1, -1, -1, -1, -1])
+    True
+    >>> breadth_first_search(graph, 0, 6, [-1, -1, -1, -1, -1, -1])
+    Traceback (most recent call last):
+        ...
+    IndexError: list index out of range
+    """
+    visited = [False] * len(graph)  # Mark all nodes as not visited
+    queue = []  # breadth-first search queue
+
+    # Source node
+    queue.append(source)
+    visited[source] = True
 
     while queue:
-        u = queue.pop(0)
-        for ind in range(len(graph[u])):
-            if visited[ind] is False and graph[u][ind] > 0:
+        u = queue.pop(0)  # Pop the front node
+        # Traverse all adjacent nodes of u
+        for ind, node in enumerate(graph[u]):
+            if visited[ind] is False and node > 0:
                 queue.append(ind)
                 visited[ind] = True
-                parent[ind] = u
-
-    return visited[t]
+                parents[ind] = u
+    return visited[sink]
 
 
-def ford_fulkerson(graph, source, sink):
-    # This array is filled by BFS and to store path
+def ford_fulkerson(graph: list, source: int, sink: int) -> int:
+    """
+    This function returns the maximum flow from source to sink in the given graph.
+
+    CAUTION: This function changes the given graph.
+
+    Args:
+        graph: Adjacency matrix of graph
+        source: Source
+        sink: Sink
+
+    Returns:
+        Maximum flow
+
+    >>> test_graph = [
+    ...     [0, 16, 13, 0, 0, 0],
+    ...     [0, 0, 10, 12, 0, 0],
+    ...     [0, 4, 0, 0, 14, 0],
+    ...     [0, 0, 9, 0, 0, 20],
+    ...     [0, 0, 0, 7, 0, 4],
+    ...     [0, 0, 0, 0, 0, 0],
+    ... 
] + >>> ford_fulkerson(test_graph, 0, 5) + 23 + """ + # This array is filled by breadth-first search and to store path parent = [-1] * (len(graph)) max_flow = 0 - while bfs(graph, source, sink, parent): - path_flow = float("Inf") + + # While there is a path from source to sink + while breadth_first_search(graph, source, sink, parent): + path_flow = int(1e9) # Infinite value s = sink while s != source: - # Find the minimum value in select path + # Find the minimum value in the selected path path_flow = min(path_flow, graph[parent[s]][s]) s = parent[s] @@ -45,17 +101,12 @@ def ford_fulkerson(graph, source, sink): graph[u][v] -= path_flow graph[v][u] += path_flow v = parent[v] + return max_flow -graph = [ - [0, 16, 13, 0, 0, 0], - [0, 0, 10, 12, 0, 0], - [0, 4, 0, 0, 14, 0], - [0, 0, 9, 0, 0, 20], - [0, 0, 0, 7, 0, 4], - [0, 0, 0, 0, 0, 0], -] +if __name__ == "__main__": + from doctest import testmod -source, sink = 0, 5 -print(ford_fulkerson(graph, source, sink)) + testmod() + print(f"{ford_fulkerson(graph, source=0, sink=5) = }") diff --git a/other/dijkstra_bankers_algorithm.py b/other/bankers_algorithm.py similarity index 99% rename from other/dijkstra_bankers_algorithm.py rename to other/bankers_algorithm.py index be7bceba1..858eb0b2c 100644 --- a/other/dijkstra_bankers_algorithm.py +++ b/other/bankers_algorithm.py @@ -17,8 +17,6 @@ before deciding whether allocation should be allowed to continue. 
 from __future__ import annotations
 
-import time
-
 import numpy as np
 
 test_claim_vector = [8, 5, 9, 7]
@@ -216,7 +214,6 @@ class BankersAlgorithm:
             "Initial Available Resources: "
             + " ".join(str(x) for x in self.__available_resources())
         )
-        time.sleep(1)
 
 
 if __name__ == "__main__":
diff --git a/physics/lens_formulae.py b/physics/lens_formulae.py
new file mode 100644
index 000000000..162f3a8f3
--- /dev/null
+++ b/physics/lens_formulae.py
@@ -0,0 +1,131 @@
+"""
+This module has functions which calculate the focal length of a lens, the distance
+of the image from the lens, and the distance of the object from the lens.
+These are calculated using the lens formula.
+
+In optics, the relationship between the distance of the image (v),
+the distance of the object (u), and
+the focal length (f) of the lens is given by the formula known as the Lens formula.
+The Lens formula is applicable for convex as well as concave lenses. The formula
+is given as follows:
+
+-------------------
+| 1/f = 1/v + 1/u |
+-------------------
+
+Where
+    f = focal length of the lens in meters.
+    v = distance of the image from the lens in meters.
+    u = distance of the object from the lens in meters.
+
+To make our calculations easy, a few assumptions are made while deriving the formula
+which are important to keep in mind before solving this equation.
+The assumptions are as follows:
+    1. The object O is a point object lying somewhere on the principal axis.
+    2. The lens is thin.
+    3. The aperture of the lens taken must be small.
+    4. The angles of incidence and angle of refraction should be small.
+
+Sign convention is a set of rules to set signs for image distance, object distance,
+focal length, etc.,
+for mathematical analysis of image formation. According to it:
+    1. The object is always placed to the left of the lens.
+    2. All distances are measured from the optical centre of the lens.
+    3. 
Distances measured in the direction of the incident ray are positive and + the distances measured in the direction opposite + to that of the incident rays are negative. + 4. Distances measured along y-axis above the principal axis are positive and + that measured along y-axis below the principal + axis are negative. + +Note: Sign convention can be reversed and will still give the correct results. + +Reference for Sign convention: +https://www.toppr.com/ask/content/concept/sign-convention-for-lenses-210246/ + +Reference for assumptions: +https://testbook.com/physics/derivation-of-lens-maker-formula +""" + + +def focal_length_of_lens( + object_distance_from_lens: float, image_distance_from_lens: float +) -> float: + """ + Doctests: + >>> from math import isclose + >>> isclose(focal_length_of_lens(10,4), 6.666666666666667) + True + >>> from math import isclose + >>> isclose(focal_length_of_lens(2.7,5.8), -5.0516129032258075) + True + >>> focal_length_of_lens(0, 20) # doctest: +NORMALIZE_WHITESPACE + Traceback (most recent call last): + ... + ValueError: Invalid inputs. Enter non zero values with respect + to the sign convention. + """ + + if object_distance_from_lens == 0 or image_distance_from_lens == 0: + raise ValueError( + "Invalid inputs. Enter non zero values with respect to the sign convention." + ) + focal_length = 1 / ( + (1 / image_distance_from_lens) - (1 / object_distance_from_lens) + ) + return focal_length + + +def object_distance( + focal_length_of_lens: float, image_distance_from_lens: float +) -> float: + """ + Doctests: + >>> from math import isclose + >>> isclose(object_distance(10,40), -13.333333333333332) + True + + >>> from math import isclose + >>> isclose(object_distance(6.2,1.5), 1.9787234042553192) + True + + >>> object_distance(0, 20) # doctest: +NORMALIZE_WHITESPACE + Traceback (most recent call last): + ... + ValueError: Invalid inputs. Enter non zero values with respect + to the sign convention. 
+    """
+
+    if image_distance_from_lens == 0 or focal_length_of_lens == 0:
+        raise ValueError(
+            "Invalid inputs. Enter non zero values with respect to the sign convention."
+        )
+
+    object_distance = 1 / ((1 / image_distance_from_lens) - (1 / focal_length_of_lens))
+    return object_distance
+
+
+def image_distance(
+    focal_length_of_lens: float, object_distance_from_lens: float
+) -> float:
+    """
+    Doctests:
+    >>> from math import isclose
+    >>> isclose(image_distance(50,40), 22.22222222222222)
+    True
+    >>> from math import isclose
+    >>> isclose(image_distance(5.3,7.9), 3.1719696969696973)
+    True
+
+    >>> image_distance(0, 20)  # doctest: +NORMALIZE_WHITESPACE
+    Traceback (most recent call last):
+        ...
+    ValueError: Invalid inputs. Enter non zero values with respect
+    to the sign convention.
+    """
+    if object_distance_from_lens == 0 or focal_length_of_lens == 0:
+        raise ValueError(
+            "Invalid inputs. Enter non zero values with respect to the sign convention."
+        )
+    image_distance = 1 / ((1 / object_distance_from_lens) + (1 / focal_length_of_lens))
+    return image_distance
diff --git a/requirements.txt b/requirements.txt
index 05d9f1e8c..8937f6bb0 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -19,5 +19,6 @@ statsmodels
 sympy
 tensorflow ; python_version < '3.12'
 tweepy
-xgboost
 # yulewalker  # uncomment once audio_filters/equal_loudness_filter.py is fixed
+typing_extensions
+xgboost
diff --git a/sorts/binary_insertion_sort.py b/sorts/binary_insertion_sort.py
index 8d4102558..50653a99e 100644
--- a/sorts/binary_insertion_sort.py
+++ b/sorts/binary_insertion_sort.py
@@ -12,10 +12,11 @@ python binary_insertion_sort.py
 
 
 def binary_insertion_sort(collection: list) -> list:
-    """Pure implementation of the binary insertion sort algorithm in Python
-    :param collection: some mutable ordered collection with heterogeneous
-    comparable items inside
-    :return: the same collection ordered by ascending
+    """
+    Sorts a list using the binary insertion sort algorithm.
+
+    :param collection: A mutable ordered collection with comparable items.
+    :return: The same collection ordered in ascending order.
 
     Examples:
     >>> binary_insertion_sort([0, 4, 1234, 4, 1])
@@ -39,23 +40,27 @@ def binary_insertion_sort(collection: list) -> list:
     n = len(collection)
 
     for i in range(1, n):
-        val = collection[i]
+        value_to_insert = collection[i]
         low = 0
         high = i - 1
 
         while low <= high:
             mid = (low + high) // 2
-            if val < collection[mid]:
+            if value_to_insert < collection[mid]:
                 high = mid - 1
             else:
                 low = mid + 1
         for j in range(i, low, -1):
             collection[j] = collection[j - 1]
-        collection[low] = val
+        collection[low] = value_to_insert
 
     return collection
 
 
 if __name__ == "__main__":
     user_input = input("Enter numbers separated by a comma:\n").strip()
-    unsorted = [int(item) for item in user_input.split(",")]
-    print(binary_insertion_sort(unsorted))
+    try:
+        unsorted = [int(item) for item in user_input.split(",")]
+    except ValueError:
+        print("Invalid input. Please enter valid integers separated by commas.")
+        raise
+    print(f"{binary_insertion_sort(unsorted) = }")
diff --git a/sorts/cocktail_shaker_sort.py b/sorts/cocktail_shaker_sort.py
index b738ff31d..de126426d 100644
--- a/sorts/cocktail_shaker_sort.py
+++ b/sorts/cocktail_shaker_sort.py
@@ -1,40 +1,62 @@
-""" https://en.wikipedia.org/wiki/Cocktail_shaker_sort """
+"""
+An implementation of the cocktail shaker sort algorithm in pure Python.
+
+https://en.wikipedia.org/wiki/Cocktail_shaker_sort
+"""
 
 
-def cocktail_shaker_sort(unsorted: list) -> list:
+def cocktail_shaker_sort(arr: list[int]) -> list[int]:
     """
-    Pure implementation of the cocktail shaker sort algorithm in Python.
+    Sorts a list using the Cocktail Shaker Sort algorithm.
+
+    :param arr: List of elements to be sorted.
+    :return: Sorted list.
+
     >>> cocktail_shaker_sort([4, 5, 2, 1, 2])
     [1, 2, 2, 4, 5]
-
     >>> cocktail_shaker_sort([-4, 5, 0, 1, 2, 11])
     [-4, 0, 1, 2, 5, 11]
-
     >>> cocktail_shaker_sort([0.1, -2.4, 4.4, 2.2])
     [-2.4, 0.1, 2.2, 4.4]
-
     >>> cocktail_shaker_sort([1, 2, 3, 4, 5])
     [1, 2, 3, 4, 5]
-
     >>> cocktail_shaker_sort([-4, -5, -24, -7, -11])
     [-24, -11, -7, -5, -4]
+    >>> cocktail_shaker_sort(["elderberry", "banana", "date", "apple", "cherry"])
+    ['apple', 'banana', 'cherry', 'date', 'elderberry']
+    >>> cocktail_shaker_sort((-4, -5, -24, -7, -11))
+    Traceback (most recent call last):
+    ...
+    TypeError: 'tuple' object does not support item assignment
     """
-    for i in range(len(unsorted) - 1, 0, -1):
+    start, end = 0, len(arr) - 1
+
+    while start < end:
         swapped = False
-        for j in range(i, 0, -1):
-            if unsorted[j] < unsorted[j - 1]:
-                unsorted[j], unsorted[j - 1] = unsorted[j - 1], unsorted[j]
-                swapped = True
-
-        for j in range(i):
-            if unsorted[j] > unsorted[j + 1]:
-                unsorted[j], unsorted[j + 1] = unsorted[j + 1], unsorted[j]
+
+        # Pass from left to right
+        for i in range(start, end):
+            if arr[i] > arr[i + 1]:
+                arr[i], arr[i + 1] = arr[i + 1], arr[i]
                 swapped = True
 
         if not swapped:
             break
-    return unsorted
+
+        end -= 1  # Decrease the end pointer after each pass
+
+        # Pass from right to left
+        swapped = False  # Reset so this pass can detect whether any swap occurred
+        for i in range(end, start, -1):
+            if arr[i] < arr[i - 1]:
+                arr[i], arr[i - 1] = arr[i - 1], arr[i]
+                swapped = True
+
+        if not swapped:
+            break
+
+        start += 1  # Increase the start pointer after each pass
+
+    return arr
 
 
 if __name__ == "__main__":
diff --git a/sorts/merge_sort.py b/sorts/merge_sort.py
index e80b1cb22..0628b848b 100644
--- a/sorts/merge_sort.py
+++ b/sorts/merge_sort.py
@@ -12,9 +12,13 @@ python merge_sort.py
 
 def merge_sort(collection: list) -> list:
     """
-    :param collection: some mutable ordered collection with heterogeneous
-    comparable items inside
-    :return: the same collection ordered by ascending
+    Sorts a list using the merge sort algorithm.
+ + :param collection: A mutable ordered collection with comparable items. + :return: The same collection ordered in ascending order. + + Time Complexity: O(n log n) + Examples: >>> merge_sort([0, 5, 3, 2, 2]) [0, 2, 2, 3, 5] @@ -26,31 +30,34 @@ def merge_sort(collection: list) -> list: def merge(left: list, right: list) -> list: """ - Merge left and right. + Merge two sorted lists into a single sorted list. - :param left: left collection - :param right: right collection - :return: merge result + :param left: Left collection + :param right: Right collection + :return: Merged result """ - - def _merge(): - while left and right: - yield (left if left[0] <= right[0] else right).pop(0) - yield from left - yield from right - - return list(_merge()) + result = [] + while left and right: + result.append(left.pop(0) if left[0] <= right[0] else right.pop(0)) + result.extend(left) + result.extend(right) + return result if len(collection) <= 1: return collection - mid = len(collection) // 2 - return merge(merge_sort(collection[:mid]), merge_sort(collection[mid:])) + mid_index = len(collection) // 2 + return merge(merge_sort(collection[:mid_index]), merge_sort(collection[mid_index:])) if __name__ == "__main__": import doctest doctest.testmod() - user_input = input("Enter numbers separated by a comma:\n").strip() - unsorted = [int(item) for item in user_input.split(",")] - print(*merge_sort(unsorted), sep=",") + + try: + user_input = input("Enter numbers separated by a comma:\n").strip() + unsorted = [int(item) for item in user_input.split(",")] + sorted_list = merge_sort(unsorted) + print(*sorted_list, sep=",") + except ValueError: + print("Invalid input. 
Please enter valid integers separated by commas.") diff --git a/strings/capitalize.py b/strings/capitalize.py index e7e97c2be..c0b45e0d9 100644 --- a/strings/capitalize.py +++ b/strings/capitalize.py @@ -3,7 +3,8 @@ from string import ascii_lowercase, ascii_uppercase def capitalize(sentence: str) -> str: """ - This function will capitalize the first letter of a sentence or a word + Capitalizes the first letter of a sentence or word. + >>> capitalize("hello world") 'Hello world' >>> capitalize("123 hello world") @@ -17,6 +18,10 @@ def capitalize(sentence: str) -> str: """ if not sentence: return "" + + # Create a dictionary that maps lowercase letters to uppercase letters + # Capitalize the first character if it's a lowercase letter + # Concatenate the capitalized character with the rest of the string lower_to_upper = dict(zip(ascii_lowercase, ascii_uppercase)) return lower_to_upper.get(sentence[0], sentence[0]) + sentence[1:] diff --git a/strings/frequency_finder.py b/strings/frequency_finder.py index 19f97afbb..8479c81ae 100644 --- a/strings/frequency_finder.py +++ b/strings/frequency_finder.py @@ -49,6 +49,15 @@ def get_item_at_index_zero(x: tuple) -> str: def get_frequency_order(message: str) -> str: + """ + Get the frequency order of the letters in the given string + >>> get_frequency_order('Hello World') + 'LOWDRHEZQXJKVBPYGFMUCSNIAT' + >>> get_frequency_order('Hello@') + 'LHOEZQXJKVBPYGFWMUCDRSNIAT' + >>> get_frequency_order('h') + 'HZQXJKVBPYGFWMUCLDRSNIOATE' + """ letter_to_freq = get_letter_count(message) freq_to_letter: dict[int, list[str]] = { freq: [] for letter, freq in letter_to_freq.items()