Lossy compression works by discarding details from the original file. The image is divided into small blocks (say 8x8 pixels). The content of each block is analyzed and replaced by a more compact representation. For example, if the block contains a single color, it is replaced by one number. If the block is dominated by a certain color, it is replaced by that color. If the block contains a gradient, it is replaced by a range.
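A minimal Python sketch of the idea (the block size, threshold, and summary formats here are illustrative assumptions, not any real codec):

```python
import numpy as np

def summarize_block(block, flat_threshold=2.0):
    """Summarize an 8x8 grayscale block, discarding fine detail.

    Returns a compact description instead of the 64 raw pixel values.
    The threshold is an illustrative assumption, not from a real codec.
    """
    if np.all(block == block.flat[0]):
        return ("solid", int(block.flat[0]))              # uniform block: one number
    if block.std() < flat_threshold:
        return ("mean", int(block.mean()))                # near-uniform: dominant level
    return ("range", int(block.min()), int(block.max()))  # gradient: keep only its extremes

block = np.tile(np.arange(0, 64, 8, dtype=np.uint8), (8, 1))  # horizontal gradient
print(summarize_block(block))   # ('range', 0, 56)
```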
Sunday, February 23, 2025
Lossless Compression
The general rule is to replace a long pattern with a shorter one. It could be replacing a long repeating string with an instruction to copy a given length from a previous position, or it could be encoding the most frequent characters with fewer bytes (say 1 byte instead of 2) while rare characters get longer encodings.
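Python's standard zlib module combines exactly these two tricks (DEFLATE = LZ77 back-references plus Huffman coding), so a tiny demonstration:

```python
import zlib

text = b"the quick brown fox jumps over the lazy dog. " * 8
compressed = zlib.compress(text)             # DEFLATE: LZ77 copies + Huffman codes
print(len(text), "->", len(compressed))      # the repeated phrase shrinks to a fraction
assert zlib.decompress(compressed) == text   # lossless: the original is recovered exactly
```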
Sunday, February 16, 2025
Trapdoor Function
A function that is easy to compute in the forward direction (output from inputs) but difficult to invert (inputs from the output) without secret information. An example is multiplying two large prime numbers: computing the product is easy, but deriving the two factors from the product is difficult. (Co-prime numbers are two numbers that share no common factor other than 1; two distinct primes are always co-prime.) A trapdoor function is essential for asymmetric encryption.
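A minimal sketch of the asymmetry (the primes are tiny, chosen only for illustration; real keys use primes hundreds of digits long, where naive factoring becomes infeasible):

```python
def factor(n):
    """Naive trial division: recover p and q from n = p * q."""
    d = 3
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 2
    return None

p, q = 104729, 1299709   # the 10,000th and 100,000th primes (small, for demo only)
n = p * q                # forward direction: one multiplication, instant
print(factor(n))         # reverse direction: tens of thousands of divisions already
```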
Friday, February 7, 2025
Training with Multiple GPUs
Model parallelism builds a pipeline of GPUs. Each GPU handles a layer (or group of layers) of the model. This is used when the model is too big to fit in a single GPU's memory. Efficiency suffers from bubbles in the pipeline when data passes through the GPUs at different speeds.
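A minimal PyTorch-style sketch of the pipeline idea, assuming two CUDA devices are available (the class name and layer sizes are made up for illustration; real pipelines also split the batch into micro-batches to shrink the bubble):

```python
import torch
import torch.nn as nn

class TwoStagePipeline(nn.Module):
    """Each stage (group of layers) lives on its own GPU; activations
    are copied across devices between stages."""
    def __init__(self):
        super().__init__()
        self.stage0 = nn.Sequential(nn.Linear(512, 512), nn.ReLU()).to("cuda:0")
        self.stage1 = nn.Linear(512, 10).to("cuda:1")

    def forward(self, x):
        h = self.stage0(x.to("cuda:0"))
        return self.stage1(h.to("cuda:1"))   # hand activations to the next GPU

model = TwoStagePipeline()
out = model(torch.randn(32, 512))   # while stage1 runs, stage0 sits idle: the "bubble"
```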
Data parallelism is used if the model can fit on a single GPU. The data is divided into mini-batches that run on each GPU. Gradients are computed locally and then combined (averaged) to adjust the shared weights.
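A minimal NumPy sketch of the gradient-averaging step, using array shards to stand in for GPUs (the loss, learning rate, and shard count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(64, 3)), rng.normal(size=64)   # one global mini-batch
w = np.zeros(3)                                        # weights replicated on every "GPU"

# Each "GPU" gets one shard of the batch and computes a local gradient
# of the same squared-error loss.
grads = []
for Xs, ys in zip(np.array_split(X, 4), np.array_split(y, 4)):
    err = Xs @ w - ys
    grads.append(2 * Xs.T @ err / len(ys))   # local gradient on this shard

w -= 0.1 * np.mean(grads, axis=0)   # all-reduce: average gradients, update every replica
```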
Tensor parallelism maps different parts of the model to multiple GPUs. It differs from model parallelism in that a portion of a layer, not the entire layer, is mapped to each GPU. The input is split to feed the different GPUs according to the mapping boundary.
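A minimal NumPy sketch of the row-parallel case, where the input is split along the same boundary as the weight rows (the "GPUs" here are just array halves for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))    # input batch
W = rng.normal(size=(16, 32))   # full weight matrix of one layer

W0, W1 = np.vsplit(W, 2)        # row split: each GPU holds half of the layer's rows
x0, x1 = np.hsplit(x, 2)        # input is split along the matching boundary
y = x0 @ W0 + x1 @ W1           # partial products are summed (an all-reduce in practice)

assert np.allclose(y, x @ W)    # identical to running the whole layer on one GPU
```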