Revamp md5.py (#8065)

* Add type hints to md5.py
* Rename some vars to snake case
* Specify functions imported from math
* Rename vars and functions to be more descriptive
* Make tests from test function into doctests
* Clarify more var names
* Refactor some MD5 code into preprocess function
* Simplify loop indices in get_block_words
* Add more detailed comments, docs, and doctests
* updating DIRECTORY.md
* Convert str types to bytes
* Add tests comparing md5_me to hashlib's md5
* Replace line-break backslashes with parentheses

---------

Co-authored-by: github-actions <${GITHUB_ACTOR}@users.noreply.github.com>
Reenable files when TensorFlow supports the current Python (#8602)
2025-02-24 01:48:39 +00:00 · 2023-04-01 22:05:01 +02:00 · 2023-04-01 19:43:11 +02:00 · 2023-04-01 18:43:07 +02:00 · 2023-04-01 18:39:22 +02:00 · 2023-04-01 18:22:33 +02:00
8 changed files with 646 additions and 210 deletions
--- a/DIRECTORY.md
+++ b/DIRECTORY.md
@@ -309,6 +309,7 @@
  * [Floyd Warshall](dynamic_programming/floyd_warshall.py)
  * [Integer Partition](dynamic_programming/integer_partition.py)
  * [Iterating Through Submasks](dynamic_programming/iterating_through_submasks.py)
+  * [K Means Clustering Tensorflow](dynamic_programming/k_means_clustering_tensorflow.py)
  * [Knapsack](dynamic_programming/knapsack.py)
  * [Longest Common Subsequence](dynamic_programming/longest_common_subsequence.py)
  * [Longest Common Substring](dynamic_programming/longest_common_substring.py)
@@ -685,6 +686,7 @@
  * [2 Hidden Layers Neural Network](neural_network/2_hidden_layers_neural_network.py)
  * [Back Propagation Neural Network](neural_network/back_propagation_neural_network.py)
  * [Convolution Neural Network](neural_network/convolution_neural_network.py)
+  * [Input Data](neural_network/input_data.py)
  * [Perceptron](neural_network/perceptron.py)
  * [Simple Neural Network](neural_network/simple_neural_network.py)

@@ -715,6 +717,7 @@
  * [Archimedes Principle](physics/archimedes_principle.py)
  * [Casimir Effect](physics/casimir_effect.py)
  * [Centripetal Force](physics/centripetal_force.py)
+  * [Grahams Law](physics/grahams_law.py)
  * [Horizontal Projectile Motion](physics/horizontal_projectile_motion.py)
  * [Hubble Parameter](physics/hubble_parameter.py)
  * [Ideal Gas Law](physics/ideal_gas_law.py)
--- a/digital_image_processing/edge_detection/canny.py
+++ b/digital_image_processing/edge_detection/canny.py
@@ -18,105 +18,126 @@ def gen_gaussian_kernel(k_size, sigma):
    return g


-def canny(image, threshold_low=15, threshold_high=30, weak=128, strong=255):
-    image_row, image_col = image.shape[0], image.shape[1]
-    # gaussian_filter
-    gaussian_out = img_convolve(image, gen_gaussian_kernel(9, sigma=1.4))
-    # get the gradient and degree by sobel_filter
-    sobel_grad, sobel_theta = sobel_filter(gaussian_out)
-    gradient_direction = np.rad2deg(sobel_theta)
-    gradient_direction += PI
-
-    dst = np.zeros((image_row, image_col))
-
+def suppress_non_maximum(image_shape, gradient_direction, sobel_grad):
    """
    Non-maximum suppression. If the edge strength of the current pixel is the largest
    compared to the other pixels in the mask with the same direction, the value will be
    preserved. Otherwise, the value will be suppressed.
    """
-    for row in range(1, image_row - 1):
-        for col in range(1, image_col - 1):
+    destination = np.zeros(image_shape)
+
+    for row in range(1, image_shape[0] - 1):
+        for col in range(1, image_shape[1] - 1):
            direction = gradient_direction[row, col]

            if (
-                0 <= direction < 22.5
+                0 <= direction < PI / 8
                or 15 * PI / 8 <= direction <= 2 * PI
                or 7 * PI / 8 <= direction <= 9 * PI / 8
            ):
                w = sobel_grad[row, col - 1]
                e = sobel_grad[row, col + 1]
                if sobel_grad[row, col] >= w and sobel_grad[row, col] >= e:
-                    dst[row, col] = sobel_grad[row, col]
+                    destination[row, col] = sobel_grad[row, col]

-            elif (PI / 8 <= direction < 3 * PI / 8) or (
-                9 * PI / 8 <= direction < 11 * PI / 8
+            elif (
+                PI / 8 <= direction < 3 * PI / 8
+                or 9 * PI / 8 <= direction < 11 * PI / 8
            ):
                sw = sobel_grad[row + 1, col - 1]
                ne = sobel_grad[row - 1, col + 1]
                if sobel_grad[row, col] >= sw and sobel_grad[row, col] >= ne:
-                    dst[row, col] = sobel_grad[row, col]
+                    destination[row, col] = sobel_grad[row, col]

-            elif (3 * PI / 8 <= direction < 5 * PI / 8) or (
-                11 * PI / 8 <= direction < 13 * PI / 8
+            elif (
+                3 * PI / 8 <= direction < 5 * PI / 8
+                or 11 * PI / 8 <= direction < 13 * PI / 8
            ):
                n = sobel_grad[row - 1, col]
                s = sobel_grad[row + 1, col]
                if sobel_grad[row, col] >= n and sobel_grad[row, col] >= s:
-                    dst[row, col] = sobel_grad[row, col]
+                    destination[row, col] = sobel_grad[row, col]

-            elif (5 * PI / 8 <= direction < 7 * PI / 8) or (
-                13 * PI / 8 <= direction < 15 * PI / 8
+            elif (
+                5 * PI / 8 <= direction < 7 * PI / 8
+                or 13 * PI / 8 <= direction < 15 * PI / 8
            ):
                nw = sobel_grad[row - 1, col - 1]
                se = sobel_grad[row + 1, col + 1]
                if sobel_grad[row, col] >= nw and sobel_grad[row, col] >= se:
-                    dst[row, col] = sobel_grad[row, col]
+                    destination[row, col] = sobel_grad[row, col]

-            """
-            High-Low threshold detection. If an edge pixel’s gradient value is higher
-            than the high threshold value, it is marked as a strong edge pixel. If an
-            edge pixel’s gradient value is smaller than the high threshold value and
-            larger than the low threshold value, it is marked as a weak edge pixel. If
-            an edge pixel's value is smaller than the low threshold value, it will be
-            suppressed.
-            """
-            if dst[row, col] >= threshold_high:
-                dst[row, col] = strong
-            elif dst[row, col] <= threshold_low:
-                dst[row, col] = 0
+    return destination
+
+
+def detect_high_low_threshold(
+    image_shape, destination, threshold_low, threshold_high, weak, strong
+):
+    """
+    High-Low threshold detection. If an edge pixel’s gradient value is higher
+    than the high threshold value, it is marked as a strong edge pixel. If an
+    edge pixel’s gradient value is smaller than the high threshold value and
+    larger than the low threshold value, it is marked as a weak edge pixel. If
+    an edge pixel's value is smaller than the low threshold value, it will be
+    suppressed.
+    """
+    for row in range(1, image_shape[0] - 1):
+        for col in range(1, image_shape[1] - 1):
+            if destination[row, col] >= threshold_high:
+                destination[row, col] = strong
+            elif destination[row, col] <= threshold_low:
+                destination[row, col] = 0
            else:
-                dst[row, col] = weak
+                destination[row, col] = weak

+
+def track_edge(image_shape, destination, weak, strong):
    """
    Edge tracking. Usually a weak edge pixel caused from true edges will be connected
    to a strong edge pixel while noise responses are unconnected. As long as there is
    one strong edge pixel that is involved in its 8-connected neighborhood, that weak
    edge point can be identified as one that should be preserved.
    """
-    for row in range(1, image_row):
-        for col in range(1, image_col):
-            if dst[row, col] == weak:
+    for row in range(1, image_shape[0]):
+        for col in range(1, image_shape[1]):
+            if destination[row, col] == weak:
                if 255 in (
-                    dst[row, col + 1],
-                    dst[row, col - 1],
-                    dst[row - 1, col],
-                    dst[row + 1, col],
-                    dst[row - 1, col - 1],
-                    dst[row + 1, col - 1],
-                    dst[row - 1, col + 1],
-                    dst[row + 1, col + 1],
+                    destination[row, col + 1],
+                    destination[row, col - 1],
+                    destination[row - 1, col],
+                    destination[row + 1, col],
+                    destination[row - 1, col - 1],
+                    destination[row + 1, col - 1],
+                    destination[row - 1, col + 1],
+                    destination[row + 1, col + 1],
                ):
-                    dst[row, col] = strong
+                    destination[row, col] = strong
                else:
-                    dst[row, col] = 0
+                    destination[row, col] = 0

-    return dst
+
+def canny(image, threshold_low=15, threshold_high=30, weak=128, strong=255):
+    # gaussian_filter
+    gaussian_out = img_convolve(image, gen_gaussian_kernel(9, sigma=1.4))
+    # get the gradient and degree by sobel_filter
+    sobel_grad, sobel_theta = sobel_filter(gaussian_out)
+    gradient_direction = PI + np.rad2deg(sobel_theta)
+
+    destination = suppress_non_maximum(image.shape, gradient_direction, sobel_grad)
+
+    detect_high_low_threshold(
+        image.shape, destination, threshold_low, threshold_high, weak, strong
+    )
+
+    track_edge(image.shape, destination, weak, strong)
+
+    return destination


 if __name__ == "__main__":
    # read original image in gray mode
    lena = cv2.imread(r"../image_data/lena.jpg", 0)
    # canny edge detection
-    canny_dst = canny(lena)
-    cv2.imshow("canny", canny_dst)
+    canny_destination = canny(lena)
+    cv2.imshow("canny", canny_destination)
    cv2.waitKey(0)
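
Review note on `detect_high_low_threshold`: the per-pixel loop could probably be expressed as one vectorized NumPy call. The sketch below is only an illustration, not part of this commit. The name `classify_edges` is hypothetical, and it assumes the gradient magnitudes are non-negative and `threshold_low` is non-negative, so that border pixels (left at 0 by `suppress_non_maximum`) still map to 0.

```python
import numpy as np


def classify_edges(gradient, threshold_low=15, threshold_high=30, weak=128, strong=255):
    """Hypothetical vectorized double threshold (not the commit's code).

    Mirrors the if/elif/else order of the loop version: the strong test is
    checked first, then the suppression test, and everything else is weak.
    """
    gradient = np.asarray(gradient)
    return np.where(
        gradient >= threshold_high,
        strong,
        np.where(gradient <= threshold_low, 0, weak),
    )


if __name__ == "__main__":
    sample = np.array([[0, 15, 16], [29, 30, 100]])
    print(classify_edges(sample))
```

Unlike the function in the diff, this sketch returns a new array instead of mutating `destination` in place, so a caller would need to rebind the result.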
--- a/digital_image_processing/morphological_operations/dilation_operation.py
+++ b/digital_image_processing/morphological_operations/dilation_operation.py
@@ -1,33 +1,35 @@
+from pathlib import Path
+
 import numpy as np
 from PIL import Image


-def rgb2gray(rgb: np.array) -> np.array:
+def rgb_to_gray(rgb: np.ndarray) -> np.ndarray:
    """
    Return gray image from rgb image
-    >>> rgb2gray(np.array([[[127, 255, 0]]]))
+    >>> rgb_to_gray(np.array([[[127, 255, 0]]]))
    array([[187.6453]])
-    >>> rgb2gray(np.array([[[0, 0, 0]]]))
+    >>> rgb_to_gray(np.array([[[0, 0, 0]]]))
    array([[0.]])
-    >>> rgb2gray(np.array([[[2, 4, 1]]]))
+    >>> rgb_to_gray(np.array([[[2, 4, 1]]]))
    array([[3.0598]])
-    >>> rgb2gray(np.array([[[26, 255, 14], [5, 147, 20], [1, 200, 0]]]))
+    >>> rgb_to_gray(np.array([[[26, 255, 14], [5, 147, 20], [1, 200, 0]]]))
    array([[159.0524,  90.0635, 117.6989]])
    """
    r, g, b = rgb[:, :, 0], rgb[:, :, 1], rgb[:, :, 2]
    return 0.2989 * r + 0.5870 * g + 0.1140 * b


-def gray2binary(gray: np.array) -> np.array:
+def gray_to_binary(gray: np.ndarray) -> np.ndarray:
    """
    Return binary image from gray image
-    >>> gray2binary(np.array([[127, 255, 0]]))
+    >>> gray_to_binary(np.array([[127, 255, 0]]))
    array([[False,  True, False]])
-    >>> gray2binary(np.array([[0]]))
+    >>> gray_to_binary(np.array([[0]]))
    array([[False]])
-    >>> gray2binary(np.array([[26.2409, 4.9315, 1.4729]]))
+    >>> gray_to_binary(np.array([[26.2409, 4.9315, 1.4729]]))
    array([[False, False, False]])
-    >>> gray2binary(np.array([[26, 255, 14], [5, 147, 20], [1, 200, 0]]))
+    >>> gray_to_binary(np.array([[26, 255, 14], [5, 147, 20], [1, 200, 0]]))
    array([[False,  True, False],
           [False,  True, False],
           [False,  True, False]])
@@ -35,7 +37,7 @@ def gray2binary(gray: np.array) -> np.array:
    return (gray > 127) & (gray <= 255)


-def dilation(image: np.array, kernel: np.array) -> np.array:
+def dilation(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """
    Return dilated image
    >>> dilation(np.array([[True, False, True]]), np.array([[0, 1, 0]]))
@@ -61,14 +63,13 @@ def dilation(image: np.array, kernel: np.array) -> np.array:
    return output


-# kernel to be applied
-structuring_element = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]])
-
-
 if __name__ == "__main__":
    # read original image
-    image = np.array(Image.open(r"..\image_data\lena.jpg"))
-    output = dilation(gray2binary(rgb2gray(image)), structuring_element)
+    lena_path = Path(__file__).resolve().parent / "image_data" / "lena.jpg"
+    lena = np.array(Image.open(lena_path))
+    # kernel to be applied
+    structuring_element = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]])
+    output = dilation(gray_to_binary(rgb_to_gray(lena)), structuring_element)
    # Save the output image
    pil_img = Image.fromarray(output).convert("RGB")
    pil_img.save("result_dilation.png")
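
Review note on `dilation`: the body of the function is outside the hunks shown here, so the following is only a rough, self-contained sketch of what binary dilation with a structuring element usually looks like. The name `binary_dilation` is hypothetical. It assumes a kernel with odd dimensions and zero padding at the borders, and it does not reflect the kernel, which makes no difference for symmetric kernels such as the cross used in this file.

```python
import numpy as np


def binary_dilation(image, kernel):
    """Hypothetical binary dilation sketch (not the repository's implementation).

    An output pixel is True when any input pixel under an active (non-zero)
    kernel cell is True. Assumes odd kernel dimensions.
    """
    image = np.asarray(image, dtype=bool)
    active = np.asarray(kernel) > 0
    k_rows, k_cols = active.shape
    pad_rows, pad_cols = k_rows // 2, k_cols // 2
    # Zero (False) padding so the kernel can be centred on border pixels
    padded = np.pad(image, ((pad_rows, pad_rows), (pad_cols, pad_cols)))
    output = np.zeros(image.shape, dtype=bool)
    for row in range(image.shape[0]):
        for col in range(image.shape[1]):
            window = padded[row : row + k_rows, col : col + k_cols]
            output[row, col] = bool(np.any(window & active))
    return output


if __name__ == "__main__":
    cross = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]])
    single_pixel = np.zeros((3, 3), dtype=bool)
    single_pixel[1, 1] = True
    print(binary_dilation(single_pixel, cross))
```

A single True pixel dilated with the cross kernel grows into the shape of the cross, which is a quick sanity check for any dilation routine.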
--- a/dynamic_programming/k_means_clustering_tensorflow.py_tf
+++ b/dynamic_programming/k_means_clustering_tensorflow.py_tf
@@ -1,9 +1,10 @@
-import tensorflow as tf
 from random import shuffle
+
+import tensorflow as tf
 from numpy import array


-def TFKMeansCluster(vectors, noofclusters):
+def tf_k_means_cluster(vectors, noofclusters):
    """
    K-Means Clustering using TensorFlow.
    'vectors' should be a n*k 2-D NumPy array, where n is the number
@@ -30,7 +31,6 @@ def TFKMeansCluster(vectors, noofclusters):
    graph = tf.Graph()

    with graph.as_default():
-
        # SESSION OF COMPUTATION

        sess = tf.Session()
@@ -95,8 +95,7 @@ def TFKMeansCluster(vectors, noofclusters):
        # iterations. To keep things simple, we will only do a set number of
        # iterations, instead of using a Stopping Criterion.
        noofiterations = 100
-        for iteration_n in range(noofiterations):
-
+        for _ in range(noofiterations):
            ##EXPECTATION STEP
            ##Based on the centroid locations till last iteration, compute
            ##the _expected_ centroid assignments.
--- a/hashes/md5.py
+++ b/hashes/md5.py
@@ -1,91 +1,223 @@
-import math
+"""
+The MD5 algorithm is a hash function that's commonly used as a checksum to
+detect data corruption. The algorithm works by processing a given message in
+blocks of 512 bits, padding the message as needed. It uses the blocks to operate
+a 128-bit state and performs a total of 64 such operations. Note that all values
+are little-endian, so inputs are converted as needed.
+
+Although MD5 was used as a cryptographic hash function in the past, it's since
+been cracked, so it shouldn't be used for security purposes.
+
+For more info, see https://en.wikipedia.org/wiki/MD5
+"""
+
+from collections.abc import Generator
+from math import sin


-def rearrange(bit_string_32):
-    """[summary]
-    Regroups the given binary string.
+def to_little_endian(string_32: bytes) -> bytes:
+    """
+    Converts the given string to little-endian in groups of 8 chars.

    Arguments:
-        bitString32 {[string]} -- [32 bit binary]
+        string_32 {[string]} -- [32-char string]

    Raises:
-    ValueError -- [if the given string not are 32 bit binary string]
+        ValueError -- [input is not 32 char]

    Returns:
-        [string] -- [32 bit binary string]
-    >>> rearrange('1234567890abcdfghijklmnopqrstuvw')
-    'pqrstuvwhijklmno90abcdfg12345678'
+        32-char little-endian string
+    >>> to_little_endian(b'1234567890abcdfghijklmnopqrstuvw')
+    b'pqrstuvwhijklmno90abcdfg12345678'
+    >>> to_little_endian(b'1234567890')
+    Traceback (most recent call last):
+    ...
+    ValueError: Input must be of length 32
    """
+    if len(string_32) != 32:
+        raise ValueError("Input must be of length 32")

-    if len(bit_string_32) != 32:
-        raise ValueError("Need length 32")
-    new_string = ""
+    little_endian = b""
    for i in [3, 2, 1, 0]:
-        new_string += bit_string_32[8 * i : 8 * i + 8]
-    return new_string
+        little_endian += string_32[8 * i : 8 * i + 8]
+    return little_endian


-def reformat_hex(i):
-    """[summary]
-    Converts the given integer into 8-digit hex number.
+def reformat_hex(i: int) -> bytes:
+    """
+    Converts the given non-negative integer to hex string.
+
+    Example: Suppose the input is the following:
+        i = 1234
+
+        The input is 0x000004d2 in hex, so the little-endian hex string is
+        "d2040000".

    Arguments:
-            i {[int]} -- [integer]
+        i {[int]} -- [integer]
+
+    Raises:
+        ValueError -- [input is negative]
+
+    Returns:
+        8-char little-endian hex string
+
+    >>> reformat_hex(1234)
+    b'd2040000'
    >>> reformat_hex(666)
-    '9a020000'
+    b'9a020000'
+    >>> reformat_hex(0)
+    b'00000000'
+    >>> reformat_hex(1234567890)
+    b'd2029649'
+    >>> reformat_hex(1234567890987654321)
+    b'b11c6cb1'
+    >>> reformat_hex(-1)
+    Traceback (most recent call last):
+    ...
+    ValueError: Input must be non-negative
    """
+    if i < 0:
+        raise ValueError("Input must be non-negative")

-    hexrep = format(i, "08x")
-    thing = ""
+    hex_rep = format(i, "08x")[-8:]
+    little_endian_hex = b""
    for i in [3, 2, 1, 0]:
-        thing += hexrep[2 * i : 2 * i + 2]
-    return thing
+        little_endian_hex += hex_rep[2 * i : 2 * i + 2].encode("utf-8")
+    return little_endian_hex


-def pad(bit_string):
-    """[summary]
-    Fills up the binary string to a 512 bit binary string
+def preprocess(message: bytes) -> bytes:
+    """
+    Preprocesses the message string:
+    - Convert message to bit string
+    - Pad bit string to a multiple of 512 chars:
+        - Append a 1
+        - Append 0's until length = 448 (mod 512)
+        - Append length of original message (64 chars)
+
+    Example: Suppose the input is the following:
+        message = "a"
+
+        The message bit string is "01100001", which is 8 bits long. Thus, the
+        bit string needs 439 bits of padding so that
+        (bit_string + "1" + padding) = 448 (mod 512).
+        The message length is "000010000...0" in 64-bit little-endian binary.
+        The combined bit string is then 512 bits long.

    Arguments:
-            bitString {[string]} -- [binary string]
+        message {[string]} -- [message string]

    Returns:
-            [string] -- [binary string]
+        processed bit string padded to a multiple of 512 chars
+
+    >>> preprocess(b"a") == (b"01100001" + b"1" +
+    ...                     (b"0" * 439) + b"00001000" + (b"0" * 56))
+    True
+    >>> preprocess(b"") == b"1" + (b"0" * 447) + (b"0" * 64)
+    True
    """
-    start_length = len(bit_string)
-    bit_string += "1"
+    bit_string = b""
+    for char in message:
+        bit_string += format(char, "08b").encode("utf-8")
+    start_len = format(len(bit_string), "064b").encode("utf-8")
+
+    # Pad bit_string to a multiple of 512 chars
+    bit_string += b"1"
    while len(bit_string) % 512 != 448:
-        bit_string += "0"
-    last_part = format(start_length, "064b")
-    bit_string += rearrange(last_part[32:]) + rearrange(last_part[:32])
+        bit_string += b"0"
+    bit_string += to_little_endian(start_len[32:]) + to_little_endian(start_len[:32])
+
    return bit_string


-def get_block(bit_string):
-    """[summary]
-    Iterator:
-            Returns by each call a list of length 16 with the 32 bit
-            integer blocks.
+def get_block_words(bit_string: bytes) -> Generator[list[int], None, None]:
+    """
+    Splits bit string into blocks of 512 chars and yields each block as a list
+    of 32-bit words
+
+    Example: Suppose the input is the following:
+        bit_string =
+            "000000000...0" +  # 0x00 (32 bits, padded to the right)
+            "000000010...0" +  # 0x01 (32 bits, padded to the right)
+            "000000100...0" +  # 0x02 (32 bits, padded to the right)
+            "000000110...0" +  # 0x03 (32 bits, padded to the right)
+            ...
+            "000011110...0"    # 0x0a (32 bits, padded to the right)
+
+        Then len(bit_string) == 512, so there'll be 1 block. The block is split
+        into 32-bit words, and each word is converted to little endian. The
+        first word is interpreted as 0 in decimal, the second word is
+        interpreted as 1 in decimal, etc.
+
+        Thus, block_words == [[0, 1, 2, 3, ..., 15]].

    Arguments:
-            bit_string {[string]} -- [binary string >= 512]
+        bit_string {[string]} -- [bit string with multiple of 512 as length]
+
+    Raises:
+        ValueError -- [length of bit string isn't multiple of 512]
+
+    Yields:
+        a list of 16 32-bit words
+
+    >>> test_string = ("".join(format(n << 24, "032b") for n in range(16))
+    ...                  .encode("utf-8"))
+    >>> list(get_block_words(test_string))
+    [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]]
+    >>> list(get_block_words(test_string * 4)) == [list(range(16))] * 4
+    True
+    >>> list(get_block_words(b"1" * 512)) == [[4294967295] * 16]
+    True
+    >>> list(get_block_words(b""))
+    []
+    >>> list(get_block_words(b"1111"))
+    Traceback (most recent call last):
+    ...
+    ValueError: Input must have length that's a multiple of 512
    """
+    if len(bit_string) % 512 != 0:
+        raise ValueError("Input must have length that's a multiple of 512")

-    curr_pos = 0
-    while curr_pos < len(bit_string):
-        curr_part = bit_string[curr_pos : curr_pos + 512]
-        my_splits = []
-        for i in range(16):
-            my_splits.append(int(rearrange(curr_part[32 * i : 32 * i + 32]), 2))
-        yield my_splits
-        curr_pos += 512
+    for pos in range(0, len(bit_string), 512):
+        block = bit_string[pos : pos + 512]
+        block_words = []
+        for i in range(0, 512, 32):
+            block_words.append(int(to_little_endian(block[i : i + 32]), 2))
+        yield block_words


-def not32(i):
+def not_32(i: int) -> int:
    """
-    >>> not32(34)
+    Perform bitwise NOT on given int.
+
+    Arguments:
+        i {[int]} -- [given int]
+
+    Raises:
+        ValueError -- [input is negative]
+
+    Returns:
+        Result of bitwise NOT on i
+
+    >>> not_32(34)
    4294967261
+    >>> not_32(1234)
+    4294966061
+    >>> not_32(4294966061)
+    1234
+    >>> not_32(0)
+    4294967295
+    >>> not_32(1)
+    4294967294
+    >>> not_32(-1)
+    Traceback (most recent call last):
+    ...
+    ValueError: Input must be non-negative
    """
+    if i < 0:
+        raise ValueError("Input must be non-negative")
+
    i_str = format(i, "032b")
    new_str = ""
    for c in i_str:
@@ -93,35 +225,114 @@ def not32(i):
    return int(new_str, 2)


-def sum32(a, b):
+def sum_32(a: int, b: int) -> int:
+    """
+    Add two numbers as 32-bit ints.
+
+    Arguments:
+        a {[int]} -- [first given int]
+        b {[int]} -- [second given int]
+
+    Returns:
+        (a + b) as an unsigned 32-bit int
+
+    >>> sum_32(1, 1)
+    2
+    >>> sum_32(2, 3)
+    5
+    >>> sum_32(0, 0)
+    0
+    >>> sum_32(-1, -1)
+    4294967294
+    >>> sum_32(4294967295, 1)
+    0
+    """
    return (a + b) % 2**32


-def leftrot32(i, s):
-    return (i << s) ^ (i >> (32 - s))
-
-
-def md5me(test_string):
-    """[summary]
-    Returns a 32-bit hash code of the string 'testString'
+def left_rotate_32(i: int, shift: int) -> int:
+    """
+    Rotate the bits of a given int left by a given amount.

    Arguments:
-            testString {[string]} -- [message]
+        i {[int]} -- [given int]
+        shift {[int]} -- [shift amount]
+
+    Raises:
+        ValueError -- [either given int or shift is negative]
+
+    Returns:
+        `i` rotated to the left by `shift` bits
+
+    >>> left_rotate_32(1234, 1)
+    2468
+    >>> left_rotate_32(1111, 4)
+    17776
+    >>> left_rotate_32(2147483648, 1)
+    1
+    >>> left_rotate_32(2147483648, 3)
+    4
+    >>> left_rotate_32(4294967295, 4)
+    4294967295
+    >>> left_rotate_32(1234, 0)
+    1234
+    >>> left_rotate_32(0, 0)
+    0
+    >>> left_rotate_32(-1, 0)
+    Traceback (most recent call last):
+    ...
+    ValueError: Input must be non-negative
+    >>> left_rotate_32(0, -1)
+    Traceback (most recent call last):
+    ...
+    ValueError: Shift must be non-negative
+    """
+    if i < 0:
+        raise ValueError("Input must be non-negative")
+    if shift < 0:
+        raise ValueError("Shift must be non-negative")
+    return ((i << shift) ^ (i >> (32 - shift))) % 2**32
+
+
+def md5_me(message: bytes) -> bytes:
+    """
+    Returns the 32-char MD5 hash of a given message.
+
+    Reference: https://en.wikipedia.org/wiki/MD5#Algorithm
+
+    Arguments:
+        message {[string]} -- [message]
+
+    Returns:
+        32-char MD5 hash string
+
+    >>> md5_me(b"")
+    b'd41d8cd98f00b204e9800998ecf8427e'
+    >>> md5_me(b"The quick brown fox jumps over the lazy dog")
+    b'9e107d9d372bb6826bd81d3542a419d6'
+    >>> md5_me(b"The quick brown fox jumps over the lazy dog.")
+    b'e4d909c290d0fb1ca068ffaddf22cbd0'
+
+    >>> import hashlib
+    >>> from string import ascii_letters
+    >>> msgs = [b"", ascii_letters.encode("utf-8"), "Üñîçø∂é".encode("utf-8"),
+    ...         b"The quick brown fox jumps over the lazy dog."]
+    >>> all(md5_me(msg) == hashlib.md5(msg).hexdigest().encode("utf-8") for msg in msgs)
+    True
    """

-    bs = ""
-    for i in test_string:
-        bs += format(ord(i), "08b")
-    bs = pad(bs)
+    # Convert to bit string, add padding and append message length
+    bit_string = preprocess(message)

-    tvals = [int(2**32 * abs(math.sin(i + 1))) for i in range(64)]
+    added_consts = [int(2**32 * abs(sin(i + 1))) for i in range(64)]

+    # Starting states
    a0 = 0x67452301
    b0 = 0xEFCDAB89
    c0 = 0x98BADCFE
    d0 = 0x10325476

-    s = [
+    shift_amounts = [
        7,
        12,
        17,
@@ -188,51 +399,46 @@ def md5me(test_string):
        21,
    ]

-    for m in get_block(bs):
+    # Process bit string in chunks, each with 16 32-char words
+    for block_words in get_block_words(bit_string):
        a = a0
        b = b0
        c = c0
        d = d0
+
+        # Hash current chunk
        for i in range(64):
            if i <= 15:
-                # f = (B & C) | (not32(B) & D)
+                # f = (b & c) | (not_32(b) & d)     # Alternate definition for f
                f = d ^ (b & (c ^ d))
                g = i
            elif i <= 31:
-                # f = (D & B) | (not32(D) & C)
+                # f = (d & b) | (not_32(d) & c)     # Alternate definition for f
                f = c ^ (d & (b ^ c))
                g = (5 * i + 1) % 16
            elif i <= 47:
                f = b ^ c ^ d
                g = (3 * i + 5) % 16
            else:
-                f = c ^ (b | not32(d))
+                f = c ^ (b | not_32(d))
                g = (7 * i) % 16
-            dtemp = d
+            f = (f + a + added_consts[i] + block_words[g]) % 2**32
+            a = d
            d = c
            c = b
-            b = sum32(b, leftrot32((a + f + tvals[i] + m[g]) % 2**32, s[i]))
-            a = dtemp
-        a0 = sum32(a0, a)
-        b0 = sum32(b0, b)
-        c0 = sum32(c0, c)
-        d0 = sum32(d0, d)
+            b = sum_32(b, left_rotate_32(f, shift_amounts[i]))
+
+        # Add hashed chunk to running total
+        a0 = sum_32(a0, a)
+        b0 = sum_32(b0, b)
+        c0 = sum_32(c0, c)
+        d0 = sum_32(d0, d)

    digest = reformat_hex(a0) + reformat_hex(b0) + reformat_hex(c0) + reformat_hex(d0)
    return digest


-def test():
-    assert md5me("") == "d41d8cd98f00b204e9800998ecf8427e"
-    assert (
-        md5me("The quick brown fox jumps over the lazy dog")
-        == "9e107d9d372bb6826bd81d3542a419d6"
-    )
-    print("Success.")
-
-
 if __name__ == "__main__":
-    test()
    import doctest

    doctest.testmod()
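
Review note on `preprocess`: the padding rule in its docstring (append a 1 bit, pad with 0 bits to 448 mod 512, then append the 64-bit little-endian message length) can be cross-checked at the byte level. The sketch below is an illustration only, not part of this commit. `md5_pad` and `to_bit_string` are hypothetical helper names, and `struct.pack("<Q", ...)` is assumed to be an acceptable stand-in for the two `to_little_endian` calls on the length field.

```python
import struct


def md5_pad(message: bytes) -> bytes:
    """Hypothetical byte-level version of the padding described in preprocess."""
    bit_length = (8 * len(message)) % 2**64
    # 0x80 is the single 1 bit followed by seven 0 bits
    padded = message + b"\x80"
    # Zero bytes until the length is 56 mod 64 (448 mod 512 in bits)
    padded += b"\x00" * ((56 - len(padded)) % 64)
    # Original length in bits as a 64-bit little-endian integer
    return padded + struct.pack("<Q", bit_length)


def to_bit_string(data: bytes) -> bytes:
    """Render bytes as an ASCII bit string, the representation md5.py works on."""
    return "".join(format(byte, "08b") for byte in data).encode("utf-8")


if __name__ == "__main__":
    expected = b"01100001" + b"1" + (b"0" * 439) + b"00001000" + (b"0" * 56)
    print(to_bit_string(md5_pad(b"a")) == expected)
```

The `expected` value is copied from the `preprocess(b"a")` doctest in the diff, so agreement here suggests the bit-string implementation and the usual byte-level formulation describe the same padding.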
--- a/neural_network/input_data.py_tf
+++ b/neural_network/input_data.py_tf
@@ -21,13 +21,10 @@ This module and all its submodules are deprecated.
 import collections
 import gzip
 import os
+import urllib

 import numpy
-from six.moves import urllib
-from six.moves import xrange  # pylint: disable=redefined-builtin
-
-from tensorflow.python.framework import dtypes
-from tensorflow.python.framework import random_seed
+from tensorflow.python.framework import dtypes, random_seed
 from tensorflow.python.platform import gfile
 from tensorflow.python.util.deprecation import deprecated

@@ -46,16 +43,16 @@ def _read32(bytestream):
 def _extract_images(f):
    """Extract the images into a 4D uint8 numpy array [index, y, x, depth].

-  Args:
-    f: A file object that can be passed into a gzip reader.
+    Args:
+      f: A file object that can be passed into a gzip reader.

-  Returns:
-    data: A 4D uint8 numpy array [index, y, x, depth].
+    Returns:
+      data: A 4D uint8 numpy array [index, y, x, depth].

-  Raises:
-    ValueError: If the bytestream does not start with 2051.
+    Raises:
+      ValueError: If the bytestream does not start with 2051.

-  """
+    """
    print("Extracting", f.name)
    with gzip.GzipFile(fileobj=f) as bytestream:
        magic = _read32(bytestream)
@@ -86,17 +83,17 @@ def _dense_to_one_hot(labels_dense, num_classes):
 def _extract_labels(f, one_hot=False, num_classes=10):
    """Extract the labels into a 1D uint8 numpy array [index].

-  Args:
-    f: A file object that can be passed into a gzip reader.
-    one_hot: Does one hot encoding for the result.
-    num_classes: Number of classes for the one hot encoding.
+    Args:
+      f: A file object that can be passed into a gzip reader.
+      one_hot: Does one hot encoding for the result.
+      num_classes: Number of classes for the one hot encoding.

-  Returns:
-    labels: a 1D uint8 numpy array.
+    Returns:
+      labels: a 1D uint8 numpy array.

-  Raises:
-    ValueError: If the bystream doesn't start with 2049.
-  """
+    Raises:
+      ValueError: If the bystream doesn't start with 2049.
+    """
    print("Extracting", f.name)
    with gzip.GzipFile(fileobj=f) as bytestream:
        magic = _read32(bytestream)
@@ -115,8 +112,8 @@ def _extract_labels(f, one_hot=False, num_classes=10):
 class _DataSet:
    """Container class for a _DataSet (deprecated).

-  THIS CLASS IS DEPRECATED.
-  """
+    THIS CLASS IS DEPRECATED.
+    """

    @deprecated(
        None,
@@ -135,21 +132,21 @@ class _DataSet:
    ):
        """Construct a _DataSet.

-    one_hot arg is used only if fake_data is true.  `dtype` can be either
-    `uint8` to leave the input as `[0, 255]`, or `float32` to rescale into
-    `[0, 1]`.  Seed arg provides for convenient deterministic testing.
+        one_hot arg is used only if fake_data is true.  `dtype` can be either
+        `uint8` to leave the input as `[0, 255]`, or `float32` to rescale into
+        `[0, 1]`.  Seed arg provides for convenient deterministic testing.

-    Args:
-      images: The images
-      labels: The labels
-      fake_data: Ignore images and labels, use fake data.
-      one_hot: Bool, return the labels as one hot vectors (if True) or ints (if
-        False).
-      dtype: Output image dtype. One of [uint8, float32]. `uint8` output has
-        range [0,255]. float32 output has range [0,1].
-      reshape: Bool. If True returned images are returned flattened to vectors.
-      seed: The random seed to use.
-    """
+        Args:
+          images: The images
+          labels: The labels
+          fake_data: Ignore images and labels, use fake data.
+          one_hot: Bool, return the labels as one hot vectors (if True) or ints (if
+            False).
+          dtype: Output image dtype. One of [uint8, float32]. `uint8` output has
+            range [0,255]. float32 output has range [0,1].
+          reshape: Bool. If True returned images are returned flattened to vectors.
+          seed: The random seed to use.
+        """
        seed1, seed2 = random_seed.get_seed(seed)
        # If op level seed is not set, use whatever graph level seed is returned
        numpy.random.seed(seed1 if seed is None else seed2)
@@ -206,8 +203,8 @@ class _DataSet:
            else:
                fake_label = 0
            return (
-                [fake_image for _ in xrange(batch_size)],
-                [fake_label for _ in xrange(batch_size)],
+                [fake_image for _ in range(batch_size)],
+                [fake_label for _ in range(batch_size)],
            )
        start = self._index_in_epoch
        # Shuffle for the first epoch
@@ -250,19 +247,19 @@ class _DataSet:
 def _maybe_download(filename, work_directory, source_url):
    """Download the data from source url, unless it's already here.

-  Args:
-      filename: string, name of the file in the directory.
-      work_directory: string, path to working directory.
-      source_url: url to download from if file doesn't exist.
+    Args:
+        filename: string, name of the file in the directory.
+        work_directory: string, path to working directory.
+        source_url: url to download from if file doesn't exist.

-  Returns:
-      Path to resulting file.
-  """
+    Returns:
+        Path to resulting file.
+    """
    if not gfile.Exists(work_directory):
        gfile.MakeDirs(work_directory)
    filepath = os.path.join(work_directory, filename)
    if not gfile.Exists(filepath):
-        urllib.request.urlretrieve(source_url, filepath)
+        urllib.request.urlretrieve(source_url, filepath)  # noqa: S310
        with gfile.GFile(filepath) as f:
            size = f.size()
        print("Successfully downloaded", filename, size, "bytes.")
@@ -328,7 +325,8 @@ def read_data_sets(

    if not 0 <= validation_size <= len(train_images):
        raise ValueError(
-            f"Validation size should be between 0 and {len(train_images)}. Received: {validation_size}."
+            f"Validation size should be between 0 and {len(train_images)}. "
+            f"Received: {validation_size}."
        )

    validation_images = train_images[:validation_size]
@@ -336,7 +334,7 @@ def read_data_sets(
    train_images = train_images[validation_size:]
    train_labels = train_labels[validation_size:]

-    options = dict(dtype=dtype, reshape=reshape, seed=seed)
+    options = {"dtype": dtype, "reshape": reshape, "seed": seed}

    train = _DataSet(train_images, train_labels, **options)
    validation = _DataSet(validation_images, validation_labels, **options)
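The validation split performed in the `read_data_sets` hunk above can be sketched in isolation. This is a minimal sketch, assuming plain Python lists in place of the NumPy arrays used in the file; `split_validation` is a hypothetical helper name and is not part of the module.

```python
def split_validation(images: list, labels: list, validation_size: int):
    """Hypothetical helper: return (train, validation) pairs of (images, labels)."""
    # Same bounds check as in the diff: the split point must lie inside the data.
    if not 0 <= validation_size <= len(images):
        raise ValueError(
            f"Validation size should be between 0 and {len(images)}. "
            f"Received: {validation_size}."
        )
    # The first `validation_size` items become the validation set,
    # and everything after them stays in the training set.
    validation = (images[:validation_size], labels[:validation_size])
    train = (images[validation_size:], labels[validation_size:])
    return train, validation


train, validation = split_validation(list(range(10)), list("abcdefghij"), 3)
print(train[0])       # [3, 4, 5, 6, 7, 8, 9]
print(validation[1])  # ['a', 'b', 'c']
```

Slicing keeps images and labels aligned because both sequences are cut at the same index.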
--- a/physics/grahams_law.py
+++ b/physics/grahams_law.py
@@ -0,0 +1,208 @@
+"""
+Title: Graham's Law of Effusion
+
+Description: Graham's law of effusion states that the rate of effusion of a gas is
+inversely proportional to the square root of the molar mass of its particles:
+
+r1/r2 = sqrt(m2/m1)
+
+r1 = Rate of effusion for the first gas.
+r2 = Rate of effusion for the second gas.
+m1 = Molar mass of the first gas.
+m2 = Molar mass of the second gas.
+
+(Description adapted from https://en.wikipedia.org/wiki/Graham%27s_law)
+"""
+
+from math import pow, sqrt
+
+
+def validate(*values: float) -> bool:
+    """
+    Input Parameters:
+    -----------------
+    effusion_rate_1: Effusion rate of first gas (m^2/s, mm^2/s, etc.)
+    effusion_rate_2: Effusion rate of second gas (m^2/s, mm^2/s, etc.)
+    molar_mass_1: Molar mass of the first gas (g/mol, kg/kmol, etc.)
+    molar_mass_2: Molar mass of the second gas (g/mol, kg/kmol, etc.)
+
+    Returns:
+    --------
+    >>> validate(2.016, 4.002)
+    True
+    >>> validate(-2.016, 4.002)
+    False
+    >>> validate()
+    False
+    """
+    result = len(values) > 0 and all(value > 0.0 for value in values)
+    return result
+
+
+def effusion_ratio(molar_mass_1: float, molar_mass_2: float) -> float | ValueError:
+    """
+    Input Parameters:
+    -----------------
+    molar_mass_1: Molar mass of the first gas (g/mol, kg/kmol, etc.)
+    molar_mass_2: Molar mass of the second gas (g/mol, kg/kmol, etc.)
+
+    Returns:
+    --------
+    >>> effusion_ratio(2.016, 4.002)
+    1.408943
+    >>> effusion_ratio(-2.016, 4.002)
+    ValueError('Input Error: Molar mass values must be greater than 0.')
+    >>> effusion_ratio(2.016)
+    Traceback (most recent call last):
+      ...
+    TypeError: effusion_ratio() missing 1 required positional argument: 'molar_mass_2'
+    """
+    return (
+        round(sqrt(molar_mass_2 / molar_mass_1), 6)
+        if validate(molar_mass_1, molar_mass_2)
+        else ValueError("Input Error: Molar mass values must be greater than 0.")
+    )
+
+
+def first_effusion_rate(
+    effusion_rate: float, molar_mass_1: float, molar_mass_2: float
+) -> float | ValueError:
+    """
+    Input Parameters:
+    -----------------
+    effusion_rate: Effusion rate of second gas (m^2/s, mm^2/s, etc.)
+    molar_mass_1: Molar mass of the first gas (g/mol, kg/kmol, etc.)
+    molar_mass_2: Molar mass of the second gas (g/mol, kg/kmol, etc.)
+
+    Returns:
+    --------
+    >>> first_effusion_rate(1, 2.016, 4.002)
+    1.408943
+    >>> first_effusion_rate(-1, 2.016, 4.002)
+    ValueError('Input Error: Molar mass and effusion rate values must be greater than 0.')
+    >>> first_effusion_rate(1)
+    Traceback (most recent call last):
+      ...
+    TypeError: first_effusion_rate() missing 2 required positional arguments: \
+'molar_mass_1' and 'molar_mass_2'
+    >>> first_effusion_rate(1, 2.016)
+    Traceback (most recent call last):
+      ...
+    TypeError: first_effusion_rate() missing 1 required positional argument: \
+'molar_mass_2'
+    """
+    return (
+        round(effusion_rate * sqrt(molar_mass_2 / molar_mass_1), 6)
+        if validate(effusion_rate, molar_mass_1, molar_mass_2)
+        else ValueError(
+            "Input Error: Molar mass and effusion rate values must be greater than 0."
+        )
+    )
+
+
+def second_effusion_rate(
+    effusion_rate: float, molar_mass_1: float, molar_mass_2: float
+) -> float | ValueError:
+    """
+    Input Parameters:
+    -----------------
+    effusion_rate: Effusion rate of first gas (m^2/s, mm^2/s, etc.)
+    molar_mass_1: Molar mass of the first gas (g/mol, kg/kmol, etc.)
+    molar_mass_2: Molar mass of the second gas (g/mol, kg/kmol, etc.)
+
+    Returns:
+    --------
+    >>> second_effusion_rate(1, 2.016, 4.002)
+    0.709752
+    >>> second_effusion_rate(-1, 2.016, 4.002)
+    ValueError('Input Error: Molar mass and effusion rate values must be greater than 0.')
+    >>> second_effusion_rate(1)
+    Traceback (most recent call last):
+      ...
+    TypeError: second_effusion_rate() missing 2 required positional arguments: \
+'molar_mass_1' and 'molar_mass_2'
+    >>> second_effusion_rate(1, 2.016)
+    Traceback (most recent call last):
+      ...
+    TypeError: second_effusion_rate() missing 1 required positional argument: \
+'molar_mass_2'
+    """
+    return (
+        round(effusion_rate / sqrt(molar_mass_2 / molar_mass_1), 6)
+        if validate(effusion_rate, molar_mass_1, molar_mass_2)
+        else ValueError(
+            "Input Error: Molar mass and effusion rate values must be greater than 0."
+        )
+    )
+
+
+def first_molar_mass(
+    molar_mass: float, effusion_rate_1: float, effusion_rate_2: float
+) -> float | ValueError:
+    """
+    Input Parameters:
+    -----------------
+    molar_mass: Molar mass of the second gas (g/mol, kg/kmol, etc.)
+    effusion_rate_1: Effusion rate of first gas (m^2/s, mm^2/s, etc.)
+    effusion_rate_2: Effusion rate of second gas (m^2/s, mm^2/s, etc.)
+
+    Returns:
+    --------
+    >>> first_molar_mass(2, 1.408943, 0.709752)
+    0.507524
+    >>> first_molar_mass(-1, 2.016, 4.002)
+    ValueError('Input Error: Molar mass and effusion rate values must be greater than 0.')
+    >>> first_molar_mass(1)
+    Traceback (most recent call last):
+      ...
+    TypeError: first_molar_mass() missing 2 required positional arguments: \
+'effusion_rate_1' and 'effusion_rate_2'
+    >>> first_molar_mass(1, 2.016)
+    Traceback (most recent call last):
+      ...
+    TypeError: first_molar_mass() missing 1 required positional argument: \
+'effusion_rate_2'
+    """
+    return (
+        round(molar_mass / pow(effusion_rate_1 / effusion_rate_2, 2), 6)
+        if validate(molar_mass, effusion_rate_1, effusion_rate_2)
+        else ValueError(
+            "Input Error: Molar mass and effusion rate values must be greater than 0."
+        )
+    )
+
+
+def second_molar_mass(
+    molar_mass: float, effusion_rate_1: float, effusion_rate_2: float
+) -> float | ValueError:
+    """
+    Input Parameters:
+    -----------------
+    molar_mass: Molar mass of the first gas (g/mol, kg/kmol, etc.)
+    effusion_rate_1: Effusion rate of first gas (m^2/s, mm^2/s, etc.)
+    effusion_rate_2: Effusion rate of second gas (m^2/s, mm^2/s, etc.)
+
+    Returns:
+    --------
+    >>> second_molar_mass(2, 1.408943, 0.709752)
+    7.881404
+    >>> second_molar_mass(-2, 1.408943, 0.709752)
+    ValueError('Input Error: Molar mass and effusion rate values must be greater than 0.')
+    >>> second_molar_mass(1)
+    Traceback (most recent call last):
+      ...
+    TypeError: second_molar_mass() missing 2 required positional arguments: \
+'effusion_rate_1' and 'effusion_rate_2'
+    >>> second_molar_mass(1, 2.016)
+    Traceback (most recent call last):
+      ...
+    TypeError: second_molar_mass() missing 1 required positional argument: \
+'effusion_rate_2'
+    """
+    return (
+        round(molar_mass * pow(effusion_rate_1 / effusion_rate_2, 2), 6)
+        if validate(molar_mass, effusion_rate_1, effusion_rate_2)
+        else ValueError(
+            "Input Error: Molar mass and effusion rate values must be greater than 0."
+        )
+    )
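The relation `r1/r2 = sqrt(m2/m1)` from the module docstring can be rearranged for any one of the four quantities. Below is a minimal sketch of two of those rearrangements, assuming invalid input should raise `ValueError` rather than return it as the file above does. The names `require_positive`, `rate_ratio` and `molar_mass_of_second_gas` are hypothetical and do not appear in `grahams_law.py`.

```python
from math import sqrt


def require_positive(**values: float) -> None:
    """Raise ValueError if any named value is not strictly positive."""
    for name, value in values.items():
        if value <= 0:
            raise ValueError(f"{name} must be greater than 0, got {value}")


def rate_ratio(molar_mass_1: float, molar_mass_2: float) -> float:
    """Graham's law as stated: r1 / r2 = sqrt(m2 / m1)."""
    require_positive(molar_mass_1=molar_mass_1, molar_mass_2=molar_mass_2)
    return sqrt(molar_mass_2 / molar_mass_1)


def molar_mass_of_second_gas(
    molar_mass_1: float, rate_1: float, rate_2: float
) -> float:
    """Square both sides and solve for m2: m2 = m1 * (r1 / r2) ** 2."""
    require_positive(molar_mass_1=molar_mass_1, rate_1=rate_1, rate_2=rate_2)
    return molar_mass_1 * (rate_1 / rate_2) ** 2


# Round trip with the molar masses used in the doctests above.
ratio = rate_ratio(2.016, 4.002)
print(round(ratio, 6))                                        # 1.408943
print(round(molar_mass_of_second_gas(2.016, ratio, 1.0), 6))  # 4.002
```

Feeding the computed ratio back in recovers the second molar mass, which is a quick consistency check on any rearrangement of the formula.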
--- a/requirements.txt
+++ b/requirements.txt
@@ -15,7 +15,7 @@ scikit-fuzzy
 scikit-learn
 statsmodels
 sympy
-tensorflow; python_version < "3.11"
+tensorflow
 texttable
 tweepy
 xgboost
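The removed suffix `; python_version < "3.11"` is an environment marker: the installer evaluates it against the running interpreter and skips the requirement when it is false. The sketch below only emulates that comparison with a tuple check to show the effect of the change. It is an illustration, not how pip evaluates markers internally, and `old_line_installs_tensorflow` and `new_line_installs_tensorflow` are hypothetical names.

```python
import sys


def old_line_installs_tensorflow(major: int, minor: int) -> bool:
    """Emulate `tensorflow; python_version < "3.11"` for a given interpreter."""
    # python_version is the "major.minor" string, compared as a version.
    return (major, minor) < (3, 11)


def new_line_installs_tensorflow(major: int, minor: int) -> bool:
    """Emulate the bare `tensorflow` line: no marker, so it always applies."""
    return True


for version in [(3, 10), (3, 11)]:
    print(
        version,
        old_line_installs_tensorflow(*version),
        new_line_installs_tensorflow(*version),
    )

# Result of the old marker on the interpreter running this sketch.
print(old_line_installs_tensorflow(sys.version_info.major, sys.version_info.minor))
```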
Author	SHA1	Message	Date
Tianyi Zheng	33114f0272	Revamp `md5.py` (#8065 ) * Add type hints to md5.py * Rename some vars to snake case * Specify functions imported from math * Rename vars and functions to be more descriptive * Make tests from test function into doctests * Clarify more var names * Refactor some MD5 code into preprocess function * Simplify loop indices in get_block_words * Add more detailed comments, docs, and doctests * updating DIRECTORY.md * updating DIRECTORY.md * updating DIRECTORY.md * updating DIRECTORY.md * updating DIRECTORY.md * Add type hints to md5.py * Rename some vars to snake case * Specify functions imported from math * Rename vars and functions to be more descriptive * Make tests from test function into doctests * Clarify more var names * Refactor some MD5 code into preprocess function * Simplify loop indices in get_block_words * Add more detailed comments, docs, and doctests * updating DIRECTORY.md * updating DIRECTORY.md * updating DIRECTORY.md * updating DIRECTORY.md * Convert str types to bytes * Add tests comparing md5_me to hashlib's md5 * Replace line-break backslashes with parentheses --------- Co-authored-by: github-actions <${GITHUB_ACTOR}@users.noreply.github.com>	2023-04-01 22:05:01 +02:00
Maxim Smolskiy	56a40eb3ee	Reenable files when TensorFlow supports the current Python (#8602 ) * Remove python_version < "3.11" for tensorflow * Reenable neural_network/input_data.py_tf * updating DIRECTORY.md * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Try to fix ruff * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Try to fix ruff * Try to fix ruff * Try to fix ruff * Try to fix pre-commit * Try to fix * Fix * Fix * Reenable dynamic_programming/k_means_clustering_tensorflow.py_tf * updating DIRECTORY.md * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Try to fix ruff --------- Co-authored-by: github-actions <${GITHUB_ACTOR}@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2023-04-01 19:43:11 +02:00
Blake Reimer	84b6852de8	Graham's Law (#8162 ) * grahams law * doctest and type hints * doctest formatting * peer review updates	2023-04-01 18:43:07 +02:00
Tianyi Zheng	a213cea5f5	Fix `mypy` errors in `dilation_operation.py` (#8595 ) * updating DIRECTORY.md * Fix mypy errors in dilation_operation.py * Rename functions to use snake case * updating DIRECTORY.md * updating DIRECTORY.md * Replace raw file string with pathlib Path * Update digital_image_processing/morphological_operations/dilation_operation.py Co-authored-by: Christian Clauss <cclauss@me.com> --------- Co-authored-by: github-actions <${GITHUB_ACTOR}@users.noreply.github.com> Co-authored-by: Christian Clauss <cclauss@me.com>	2023-04-01 18:39:22 +02:00
Maxim Smolskiy	59cae167e0	Reduce the complexity of digital_image_processing/edge detection/canny.py (#8167 ) * Reduce the complexity of digital_image_processing/edge_detection/canny.py * Fix * updating DIRECTORY.md * updating DIRECTORY.md * updating DIRECTORY.md * Fix review issues * Rename dst to destination --------- Co-authored-by: github-actions <${GITHUB_ACTOR}@users.noreply.github.com>	2023-04-01 18:22:33 +02:00