Thursday, June 4, 2026

Vector operations on the c66x DSP

Links:
1) Programming multicore TMS320C66x DSP processors using OpenMP https://habr.com/ru/articles/318762/

pdf:
1) Sparse Matrix-Vector Multiply on the Texas Instruments C6678 Digital Signal Processor https://pdfs.semanticscholar.org/6617/964cd7ead75d18a7b25dcc04c222abdce1f9.pdf
2) Optimizing Loops on the C66x DSP https://www.ti.com/lit/pdf/sprabg7

Text descriptions of c66x functions:
1) description (section "TMS320C6600 C/C++ Compiler Intrinsics" in the TMS320C6000 Optimizing Compiler v8.2.x User's Guide) https://www.ti.com/lit/pdf/spru187
2) dotpsu4h

long long _ddotpsu4h (__x128_t src1, __x128_t src2); - Performs two dot-products between four sets of packed 16-bit values. (Two-way _dotpsu4h)

int _dotpsu4h (long long src1, long long src2); - Multiply four signed 16-bit values by four unsigned 16-bit values and return the 32-bit sum.

long long _dotpsu4hll (long long src1, long long src2); - Multiply four signed 16-bit values by four unsigned 16-bit values and return the 64-bit sum.
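To make the arithmetic behind the dot-product description concrete, here is a minimal portable sketch in plain C. It is not TI's implementation and does not use the intrinsic itself: the names pack4, lane16 and ref_dotpsu4hll are hypothetical helpers introduced only for illustration, and the code assumes a two's-complement host. It models the 64-bit variant, where lane i of src1 is treated as signed and lane i of src2 as unsigned.

```c
#include <stdint.h>

/* Hypothetical helper: pack four 16-bit lanes into one 64-bit word,
   l0 being the least significant lane. */
static uint64_t pack4(uint16_t l0, uint16_t l1, uint16_t l2, uint16_t l3)
{
    return (uint64_t)l0 | ((uint64_t)l1 << 16) |
           ((uint64_t)l2 << 32) | ((uint64_t)l3 << 48);
}

/* Hypothetical helper: extract lane i (0 = least significant). */
static uint16_t lane16(uint64_t packed, int i)
{
    return (uint16_t)(packed >> (16 * i));
}

/* Reference model of the arithmetic described for _dotpsu4hll:
   signed lanes of src1 times unsigned lanes of src2, summed in 64 bits.
   Assumes a two's-complement host for the uint16_t -> int16_t cast. */
static int64_t ref_dotpsu4hll(uint64_t src1, uint64_t src2)
{
    int64_t sum = 0;
    for (int i = 0; i < 4; i++) {
        int16_t  a = (int16_t)lane16(src1, i); /* signed operand   */
        uint16_t b = lane16(src2, i);          /* unsigned operand */
        sum += (int64_t)a * (int64_t)b;
    }
    return sum;
}
```

For example, with signed lanes {1, -2, 3, -4} and unsigned lanes {10, 20, 30, 65535} the model returns 10 - 40 + 90 - 262140 = -262080. The worst case, four lanes of -32768 times 65535, is -8589803520, which does not fit in 32 bits, so the 32-bit and 64-bit variants can differ for large inputs.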

3) dmpy2

__x128_t _dmpy2 (long long src1, long long src2); - Four-way SIMD multiply of signed 16-bit values producing four signed 32-bit results. (Two-way _mpy2)
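The lane-wise multiply can be sketched on a host machine as follows. This is a plain C model of the arithmetic only, under stated assumptions: ref_dmpy2 and pack4 are hypothetical names, the four results are written to an ordinary array instead of an __x128_t, and the mapping of array index to position inside the real 128-bit result is not modeled here. A two's-complement host is assumed.

```c
#include <stdint.h>

/* Hypothetical helper: pack four 16-bit lanes into one 64-bit word,
   l0 being the least significant lane. */
static uint64_t pack4(uint16_t l0, uint16_t l1, uint16_t l2, uint16_t l3)
{
    return (uint64_t)l0 | ((uint64_t)l1 << 16) |
           ((uint64_t)l2 << 32) | ((uint64_t)l3 << 48);
}

/* Reference model of the arithmetic described for _dmpy2:
   out[i] = (signed lane i of src1) * (signed lane i of src2).
   A 16x16 signed product always fits in 32 bits, so no overflow
   handling is needed. */
static void ref_dmpy2(uint64_t src1, uint64_t src2, int32_t out[4])
{
    for (int i = 0; i < 4; i++) {
        int16_t a = (int16_t)(uint16_t)(src1 >> (16 * i));
        int16_t b = (int16_t)(uint16_t)(src2 >> (16 * i));
        out[i] = (int32_t)a * (int32_t)b;
    }
}
```

With lanes {2, -3, 32767, -32768} and {5, 7, 2, -32768} the model produces 10, -21, 65534 and 1073741824. The last value is the largest possible product, which is why each result needs a full 32-bit lane and the output is twice as wide as the inputs.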

4) dadd2

long long _dadd2 (long long src1, long long src2); - Four-way SIMD addition of signed 16-bit values producing four signed 16-bit results. (Two-way _add2)
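A small host-side sketch may help show how a packed addition differs from an ordinary 64-bit add. Since the result is a long long, each of the four sums occupies one 16-bit lane. The function name ref_dadd2 is hypothetical, and the model assumes the sums wrap modulo 2^16 without saturation and without carrying into the neighbouring lane.

```c
#include <stdint.h>

/* Reference model of the arithmetic described for _dadd2:
   four independent 16-bit additions packed in one 64-bit word.
   Assumption: each lane wraps modulo 2^16 (no saturation). */
static uint64_t ref_dadd2(uint64_t src1, uint64_t src2)
{
    uint64_t result = 0;
    for (int i = 0; i < 4; i++) {
        uint16_t a = (uint16_t)(src1 >> (16 * i));
        uint16_t b = (uint16_t)(src2 >> (16 * i));
        uint16_t s = (uint16_t)(a + b);   /* carry out of the lane is dropped */
        result |= (uint64_t)s << (16 * i);
    }
    return result;
}
```

For the inputs 0x0001FFFF7FFF0002 and 0x0001000100010003 the model gives 0x0002000080000005, while a plain 64-bit addition would give 0x0003000080000005, because the carry from the third lane would leak into the fourth.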

c66x vector functions:
