Initial import
26
third_party/avir/LICENSE
vendored
Normal file
@@ -0,0 +1,26 @@
AVIR License Agreement

The MIT License (MIT)

AVIR Copyright (c) 2015-2019 Aleksey Vaneev

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Please credit the author of this library in your documentation in the
following way: "AVIR image resizing algorithm designed by Aleksey Vaneev"
5
third_party/avir/README.cosmo
vendored
Normal file
@@ -0,0 +1,5 @@
commit 7dd9515ef6aed6fb6d565ee12754703bdc46b3b0
Author: Aleksey Vaneev <aleksey.vaneev@gmail.com>
Date: Mon Jul 29 07:43:23 2019 +0300

Version 2.4 release.
367
third_party/avir/README.md
vendored
Normal file
@@ -0,0 +1,367 @@
# AVIR #
## Introduction ##
Keywords: image resize, image resizer, image resizing, image scaling,
image scaler, image resize c++, image resizer c++

Please consider supporting the author on [Patreon](https://www.patreon.com/aleksey_vaneev).

I, Aleksey Vaneev, am happy to offer you an open source image resizing /
scaling library which has reached a production level of quality, and is
ready to be incorporated into any project. This library features routines
for both down- and upsizing of 8- and 16-bit, 1 to 4-channel images. Image
resizing routines were implemented in multi-platform C++ code, and have a
high level of optimality. Besides resizing, this library offers a sub-pixel
shift operation. Built-in sRGB gamma correction is available.

The resizing algorithm first produces a 2X upsized image (relative to the
source image size, or relative to the destination image size if downsizing is
performed) and then performs interpolation using a bank of sinc function-based
fractional delay filters. At the last stage a correction filter is applied
which fixes smoothing introduced at previous steps.
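
To make the "bank of fractional delay filters" idea concrete, here is a minimal sketch of how such a bank can be built from a windowed sinc. It is an illustration only: the function names are hypothetical, and a plain Hann window stands in for the Peaked Cosine window that AVIR actually uses, so the coefficients will not match the library's.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

static const double kPi = 3.14159265358979323846;

// Normalized sinc: sin(pi x) / (pi x), with the removable singularity at 0.
static double sincNorm(double x) {
    if (std::fabs(x) < 1e-12) return 1.0;
    return std::sin(kPi * x) / (kPi * x);
}

// Builds `phases` filters of `taps` coefficients each. Filter i evaluates the
// signal i / phases of a sample past the centre tap. A Hann window is used
// here as a stand-in; it is NOT AVIR's Peaked Cosine window.
std::vector<std::vector<double>> makeFractionalDelayBank(int taps, int phases) {
    assert(taps >= 2 && taps % 2 == 0 && phases >= 1);
    const double half = taps / 2.0;
    std::vector<std::vector<double>> bank(phases, std::vector<double>(taps));
    for (int i = 0; i < phases; ++i) {
        const double frac = static_cast<double>(i) / phases;
        double sum = 0.0;
        for (int t = 0; t < taps; ++t) {
            // Signed distance from tap t to the interpolation point.
            const double x = (t - (half - 1.0)) - frac;
            const double window = 0.5 * (1.0 + std::cos(kPi * x / half));
            bank[i][t] = sincNorm(x) * window;
            sum += bank[i][t];
        }
        for (double& c : bank[i]) c /= sum;  // normalize to unity gain at DC
    }
    return bank;
}
```

With `makeFractionalDelayBank(18, 4)`, the filter for offset 0 reduces to a single unit tap, while the filter for offset 0.5 is symmetric around the two centre taps.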

The resizing algorithm was designed to provide the best visual quality. The
author even believes this algorithm provides the "ultimate" level of
quality (for an orthogonal resizing) which cannot be increased further: no
math exists to provide a better frequency response, better anti-aliasing
quality and at the same time fewer ringing artifacts: these are 3
elements that define any resizing algorithm's quality; in AVIR practice these
elements have a high correlation to each other, so they can be represented by
a single parameter (AVIR offers several parameter sets with varying quality).
The algorithm's time performance turned out to be very good as well (for the
"ultimate" image quality).

An important element utilized by this algorithm is the so-called Peaked Cosine
window function, which is applied over the sinc function in all filters. Please
consult the documentation for more details.

Note that since AVIR implements orthogonal resizing, it may exhibit diagonal
aliasing artifacts. These artifacts are usually suppressed by EWA or radial
filtering techniques. An EWA-like technique is not implemented in AVIR, because
it requires considerably more computing resources and may produce a blurred
image.

As a bonus, a faster `LANCIR` image resizing algorithm is also offered as a
part of this library. But the main focus of this documentation is the original
AVIR image resizing algorithm.

AVIR does not offer affine and non-linear image transformations "out of the
box". Since upsizing is a relatively fast operation in AVIR (required time
scales linearly with the output image area), affine and non-linear
transformations can be implemented in steps: 4- to 8-times upsizing,
transformation via bilinear interpolation, downsizing (linear proportional
affine transformations can probably skip the downsizing step). This should not
compromise the transformation quality much as bilinear interpolation's
problems will mostly reside in spectral area without useful signal, with a
maximum of 0.7 dB high-frequency attenuation for 4-times upsizing, and 0.17 dB
attenuation for 8-times upsizing. This approach is probably as time efficient
as performing a high-quality transform over the input image directly (the only
serious drawback is the increased memory requirement). Note that affine
transformations that change image proportions should first apply proportion
change during upsizing.
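
The 0.7 dB and 0.17 dB figures can be reproduced with a short calculation. The README does not say how they were derived, so the model below is an assumption: the worst case of linear interpolation is taken to be the half-sample offset, where averaging two neighbours has gain cos(w/2), and after upsizing by a given factor the original Nyquist frequency lands at w = pi / factor. The function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Worst-case attenuation (in dB, as a positive number) of linear
// interpolation at the ORIGINAL image's Nyquist frequency, after the image
// has been upsized `factor` times. Assumed model: half-sample offset, where
// the two-tap average has gain cos(w / 2) at angular frequency w.
double bilinearWorstCaseAttenuationDb(double factor) {
    assert(factor > 1.0);
    const double kPi = 3.14159265358979323846;
    const double w = kPi / factor;  // original Nyquist on the upsized grid
    return -20.0 * std::log10(std::cos(w / 2.0));
}
```

Under this model a factor of 4 gives about 0.69 dB and a factor of 8 about 0.17 dB, which agrees with the numbers quoted above to the precision given.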

*AVIR is devoted to women. Your digital photos can look good at any size!*

## Requirements ##
C++ compiler and system with efficient "float" floating point (24-bit
mantissa) type support. This library can also internally use the "double" and
SIMD floating point types during resizing if needed. This library does not
have dependencies besides the standard C library.

## Links ##
* [Documentation](https://www.voxengo.com/public/avir/Documentation/)

## Usage Information ##
The image resizer is represented by the `avir::CImageResizer<>` class, which
is a single front-end class for the whole library. Basically, you do not need
to use or understand any other classes besides this class.

The code of the library resides in the "avir" C++ namespace, effectively
isolating it from all other code. The code is thread-safe. You need just
a single resizer object per running application, at any time, even when
resizing images concurrently.

To resize images in your application, simply add 3 lines of code:

    #include "avir.h"
    avir :: CImageResizer<> ImageResizer( 8 );
    ImageResizer.resizeImage( InBuf, 640, 480, 0, OutBuf, 1024, 768, 3, 0 );

(Multi-threaded operation requires additional coding; see the documentation.)
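
The three-line example leaves the allocation of `InBuf` and `OutBuf` to the caller. The sketch below shows one way to size them, assuming the call is read as "640x480 source, 1024x768 destination, 3 interleaved 8-bit channels" and that the `0` scanline-size argument means tightly packed rows. Both readings are assumptions to check against the documentation, and `packedImageSize`, `ResizeBuffers` and `makeRgbBuffers` are hypothetical helper names, not part of AVIR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of elements in a tightly packed, channel-interleaved image buffer.
std::size_t packedImageSize(int width, int height, int channels) {
    assert(width > 0 && height > 0 && channels >= 1 && channels <= 4);
    return static_cast<std::size_t>(width) * height * channels;
}

// Hypothetical holder for the source and destination buffers of one resize.
struct ResizeBuffers {
    std::vector<std::uint8_t> in;
    std::vector<std::uint8_t> out;
};

// Allocates zero-filled 8-bit RGB buffers for the given image sizes.
ResizeBuffers makeRgbBuffers(int srcW, int srcH, int dstW, int dstH) {
    return ResizeBuffers{
        std::vector<std::uint8_t>(packedImageSize(srcW, srcH, 3)),
        std::vector<std::uint8_t>(packedImageSize(dstW, dstH, 3))};
}
```

With `ResizeBuffers b = makeRgbBuffers(640, 480, 1024, 768);`, the pointers `b.in.data()` and `b.out.data()` would then take the place of `InBuf` and `OutBuf` in the call shown above.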

For low-ringing performance:

    avir :: CImageResizer<> ImageResizer( 8, 0, avir :: CImageResizerParamsLR() );

To use the built-in gamma correction, an object of the
`avir::CImageResizerVars` class with its variable `UseSRGBGamma` set to "true"
should be supplied to the `resizeImage()` function. Note that the gamma
correction is applied to all channels (e.g. alpha-channel) in the current
implementation.

    avir :: CImageResizerVars Vars;
    Vars.UseSRGBGamma = true;
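
For readers unfamiliar with what sRGB gamma correction does to pixel values, the standard sRGB transfer functions are sketched below. This is the textbook formula, not a copy of AVIR's code; the library's internal implementation may differ (for example, by using an approximation).

```cpp
#include <cassert>
#include <cmath>

// Standard sRGB decoding: gamma-encoded value in [0, 1] to linear light.
double srgbToLinear(double s) {
    assert(s >= 0.0 && s <= 1.0);
    return s <= 0.04045 ? s / 12.92 : std::pow((s + 0.055) / 1.055, 2.4);
}

// Standard sRGB encoding: linear light in [0, 1] to gamma-encoded value.
double linearToSrgb(double l) {
    assert(l >= 0.0 && l <= 1.0);
    return l <= 0.0031308 ? l * 12.92
                          : 1.055 * std::pow(l, 1.0 / 2.4) - 0.055;
}
```

The practical effect is that filtering happens on linear-light values: an even mix of black and white is 0.5 in linear light, which encodes to roughly 0.735 in sRGB rather than 0.5.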

Dithering (error-diffusion dither which is perceptually good) can be enabled
this way:

    typedef avir :: fpclass_def< float, float,
        avir :: CImageResizerDithererErrdINL< float > > fpclass_dith;
    avir :: CImageResizer< fpclass_dith > ImageResizer( 8 );
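
As a rough illustration of what error-diffusion dithering means, the sketch below quantizes one scanline to 8 bits and carries each pixel's rounding error over to the next pixel. It is a deliberately simple one-dimensional scheme with a hypothetical function name; it is not the algorithm implemented by `CImageResizerDithererErrdINL`, whose error distribution is not described here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Quantizes one scanline of values in [0, 255] to 8-bit integers, pushing
// each pixel's rounding error onto the next pixel in the row.
std::vector<std::uint8_t> ditherRow(const std::vector<double>& row) {
    std::vector<std::uint8_t> out(row.size());
    double carry = 0.0;  // quantization error left over from the previous pixel
    for (std::size_t i = 0; i < row.size(); ++i) {
        assert(row[i] >= 0.0 && row[i] <= 255.0);
        const double wanted = row[i] + carry;
        const double q =
            std::min(255.0, std::max(0.0, std::floor(wanted + 0.5)));
        out[i] = static_cast<std::uint8_t>(q);
        carry = wanted - q;
    }
    return out;
}
```

For a row of four pixels that all equal 100.25, plain rounding would output 100 four times and lose the fraction, whereas this sketch outputs 100, 101, 100, 100, preserving the row's average.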

The library is able to process images of any bit depth: this includes 8-bit,
16-bit, float and double types. Larger integer and signed integer types are
not supported. Supported source and destination image sizes are only limited
by the available system memory.

The code of this library was commented in the [Doxygen](http://www.doxygen.org/)
style. To generate the documentation locally you may run the
`doxygen ./other/avirdoxy.txt` command from the library's directory. Note that
the code was suitably documented allowing you to make modifications, and to
gain full understanding of the algorithm.

Preliminary tests show that this library (compiled with Intel C++ Compiler
18.2 with AVX2 instructions enabled, without explicit SIMD resizing code) can
resize 8-bit RGB 5184x3456 (17.9 Mpixel) 3-channel image down to 1920x1280
(2.5 Mpixel) image in 245 milliseconds, utilizing a single thread, on Intel
Core i7-7700K processor-based system without overclocking. This scales down to
74 milliseconds if 8 threads are utilized.

Multi-threaded operation is not provided by this library "out of the box".
The multi-threaded (horizontally-threaded) infrastructure is available, but
requires additional system-specific interfacing code for engagement.

## SIMD Usage Information ##
This library is capable of using SIMD floating point types for internal
variables. This means that up to 4 color channels can be processed in
parallel. Since the default interleaved processing algorithm itself remains
non-SIMD, the use of SIMD internal types is not practical for 1- and 2-channel
image resizing (due to overhead). A SIMD internal type can be used this way:

    #include "avir_float4_sse.h"
    avir :: CImageResizer< avir :: fpclass_float4 > ImageResizer( 8 );

For 1-channel and 2-channel image resizing when AVX instructions are allowed
it may be reasonable to utilize the de-interleaved SIMD processing algorithm.
While it gives no performance benefit if the "float4" SSE processing type is
used, it offers some performance boost if the "float8" AVX processing type is
used (given dithering is not performed, or otherwise performance is reduced at
the dithering stage since recursive dithering cannot be parallelized). The
internal type remains non-SIMD "float". The de-interleaved algorithm can be used
this way:

    #include "avir_float8_avx.h"
    avir :: CImageResizer< avir :: fpclass_float8_dil > ImageResizer( 8 );

It's important to note that on the latest Intel processors (i7-7700K and
probably later) the use of the aforementioned SIMD-specific resizing code may
not be justifiable, or may be even counter-productive due to many factors:
memory bandwidth bottleneck, increased efficiency of processor's circuitry
utilization and out-of-order execution, automatic SIMD optimizations performed
by the compiler. This is at least true when compiling 64-bit code with Intel
C++ Compiler 18.2 with /QxSSE4.2, or especially with the /QxCORE-AVX2 option.
SSE-specific resizing code may still be a little bit more efficient for
4-channel image resizing.

## Notes ##
This library was tested for compatibility with [GNU C++](http://gcc.gnu.org/),
[Microsoft Visual C++](http://www.microsoft.com/visualstudio/eng/products/visual-studio-express-products)
and [Intel C++](http://software.intel.com/en-us/c-compilers) compilers, on 32-
and 64-bit Windows, macOS and CentOS Linux. The code was also tested with
Dr.Memory/Win32 for the absence of uninitialized or unaddressable memory
accesses.

All code is fully "inline", without the need to compile any source files. The
memory footprint of the library itself is very modest, except that the size of
the temporary image buffers depends on the input and output image sizes, and
is proportionally large.

The "heart" of resizing algorithm's quality resides in the parameters defined
via the `avir::CImageResizerParams` structure. While the default set of
parameters that offers a good quality was already provided, there is
(probably) still room for improvement, and the default parameters
may change in a future update. If you need to recall an exact set of
parameters, simply save them locally for a later use.

When the algorithm is run with no resizing applied (k=1), the result of
resizing will not be an exact, but a very close copy of the source image. The
reason for such inexactness is that the image is always low-pass filtered at
first to reduce aliasing during subsequent resizing, and at last filtered by a
correction filter. This approach allows the algorithm to maintain a stable level
of quality regardless of the resizing "k" factor used.

This library includes a binary command line tool "imageresize" for major
desktop platforms. This tool was designed to be used as a demonstration of
the library's performance and as a reference; it is multi-threaded (the `-t`
switch can be used to control the number of threads utilized). This tool uses
plain "float" processing (no explicit SIMD) and relies on automatic compiler
optimization (with Win64 binary being the "main" binary as it was compiled
with the best ICC optimization options for the time being). This tool uses the
following libraries:
* turbojpeg Copyright (c) 2009-2013 D. R. Commander
* libpng Copyright (c) 1998-2013 Glenn Randers-Pehrson
* zlib Copyright (c) 1995-2013 Jean-loup Gailly and Mark Adler

Note that you can enable gamma-correction with the `-g` switch. However,
sometimes gamma-correction produces "greenish/reddish/bluish haze" since
low-amplitude oscillations produced by resizing at object boundaries are
amplified by gamma correction. This can also have an effect of reduced
contrast.

## Interpolation Discussion ##
The use of certain low-pass filters and 2X upsampling in this library is
hardly debatable, because they are needed to attain a certain anti-aliasing
effect and keep ringing artifacts low. But the use of a sinc function-based
interpolation filter that is 18 taps long (may be higher, up to 36 taps in
practice) can be questioned, because even in the 0th order case such an
interpolation filter requires 18 multiply-add operations. Comparatively, an
optimal Hermite or cubic interpolation spline requires 8 multiply and 11 add
operations.

One of the reasons the 18-tap filter is preferred is that, due to memory
bandwidth limitations, using a lower-order filter does not provide any
significant performance increase (e.g. 14-tap filter is less than 5% more
efficient overall). At the same time, in comparison to cubic spline, 18-tap
filter embeds a low-pass filter that rejects signal above 0.5\*pi (provides
additional anti-aliasing filtering), and this filter has a consistent shape at
all fractional offsets. Splines have a varying low-pass filter shape at
different fractional offsets (e.g. no low-pass filtering at 0.0 offset,
and maximal low-pass filtering at 0.5 offset). 18-tap filter also offers a
superior stop-band attenuation which almost guarantees absence of artifacts if
the image is considerably sharpened afterwards.
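
The claim that splines change their low-pass shape with the fractional offset can be checked numerically. The sketch below uses the Catmull-Rom spline as a representative cubic interpolator (an assumption: the README does not name a specific spline) and measures the gain of its four weights at a given frequency; function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Catmull-Rom cubic spline weights for the four samples around the
// interpolation point; t in [0, 1) is the offset from the second sample.
std::array<double, 4> catmullRomWeights(double t) {
    assert(t >= 0.0 && t < 1.0);
    const double t2 = t * t;
    const double t3 = t2 * t;
    std::array<double, 4> h;
    h[0] = (-t3 + 2.0 * t2 - t) / 2.0;
    h[1] = (3.0 * t3 - 5.0 * t2 + 2.0) / 2.0;
    h[2] = (-3.0 * t3 + 4.0 * t2 + t) / 2.0;
    h[3] = (t3 - t2) / 2.0;
    return h;
}

// Magnitude of the 4-tap filter's frequency response at angular frequency w,
// where w = pi corresponds to the Nyquist frequency.
double gainAt(const std::array<double, 4>& h, double w) {
    double re = 0.0;
    double im = 0.0;
    for (int n = 0; n < 4; ++n) {
        re += h[n] * std::cos(w * n);
        im -= h[n] * std::sin(w * n);
    }
    return std::sqrt(re * re + im * im);
}
```

At offset 0.0 the weights collapse to a single unit tap, so the gain at Nyquist is 1 (no low-pass filtering). At offset 0.5 the weights are -1/16, 9/16, 9/16, -1/16 and the gain at Nyquist drops to 0, which is the varying behaviour described above.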

## Why 2X upsizing in AVIR? ##
Classic approaches to image resizing do not perform an additional 2X upsizing.
So, why is such upsizing needed at all in AVIR? Indeed, image resizing can be
implemented using a single interpolation filter which is applied to the source
image directly. However, such an approach has limitations:

First of all, speaking about non-2X-upsized resizing, during upsizing the
interpolation filter has to be tuned to a frequency close to pi (Nyquist) in
order to reduce high-frequency smoothing: this reduces the space left for
filter optimization. Besides that, during downsizing, a filter that performs
well and predictably when tuned to frequencies close to the Nyquist frequency,
may become distorted in its spectral shape when it is tuned to lower
frequencies. That is why it is usually a good idea to have the filter's stop-band
begin below Nyquist so that the transition band's shape remains stable at any
lower-frequency setting. At the same time, this requirement complicates
further corrective filtering, because the correction filter may become too steep
at the point where the stop-band begins.

Secondly, speaking about non-2X-upsized resizing, the filter has to be very short
(with a base length of 5-7 taps, further multiplied by the resizing factor) or
otherwise the ringing artifacts will be very strong: it is a general rule that
the steeper the filter is around signal frequencies being removed the higher
the ringing artifacts are. That is why it is preferred to move steep
transitions into the spectral area with a quieter signal. A short filter also
means it cannot provide a strong "beyond-Nyquist" stop-band attenuation, so an
interpolated image will look a bit edgy or not very clean due to stop-band
artifacts.

To sum up, only additional controlled 2X upsizing provides enough spectral
space to design an interpolation filter without visible ringing artifacts yet
providing a strong stop-band attenuation and stable spectral characteristics
(good at any resizing "k" factor). Moreover, 2X upsizing becomes very
important in maintaining a good resizing quality when downsizing and upsizing
by small "k" factors, in the range 0.5 to 2: resizing approaches that do not
perform 2X upsizing usually cannot design a good interpolation filter for such
factors just because there is not enough spectral space available.

## Why Peaked Cosine in AVIR? ##
First of all, AVIR is a general solution to the image resizing problem. That is
why it should not be directly compared to "spline interpolation" or "Lanczos
resampling", because the latter two are only means to design interpolation
filters, and they can be implemented in a variety of ways, even in sub-optimal
ways. Secondly, with only a minimal effort AVIR can be changed to use any
existing interpolation formula and any window function, but this is just not
needed.

An effort was made to compare Peaked Cosine to Lanczos window function, and
here is the author's opinion. Peaked Cosine has two degrees of freedom whereas
Lanczos has one degree of freedom. While both functions can be used with
acceptable results, Peaked Cosine window function used in automatic parameter
optimization really pushes the limits of frequency response linearity,
anti-aliasing strength (stop-band attenuation) and low-ringing performance
which Lanczos cannot usually achieve. This is true at least when using a
general-purpose downhill simplex optimization method. Lanczos window has good
(but not better) characteristics in several special cases (certain "k"
factors) which makes it of limited use in a general solution such as AVIR.

Among other window functions (Kaiser, Gaussian, Cauchy, Poisson, generalized
cosine windows) there are no better candidates as well. It looks like Peaked
Cosine function's scalability (it retains stable, almost continuously-variable
spectral characteristics at any window parameter values), and its ability to
create "desirable" pass-band ripple in the frequency response near the cutoff
point contribute to its better overall quality. Somehow Peaked Cosine window
function optimization manages to converge to reasonable states in most cases
(that is why AVIR library comes with a set of equally robust, but distinctive
parameter sets) whereas all other window functions tend to produce
unpredictable optimization results.

The only disadvantage of Peaked Cosine window function is that usable filters
windowed by this function tend to be longer than "usual" (with Kaiser window
being the "gold standard" for filter length per decibel of stop-band
attenuation). This is a price that should be paid for stable spectral
characteristics.

## LANCIR ##

As a part of AVIR library, the `CLancIR` class is also offered which is an
optimal implementation of *Lanczos* image resizing filter. This class has a
similar programmatic interface to AVIR, but it is not thread-safe: each
executing thread should have its own `CLancIR` object. This class was designed
for cases of batch processing of same-sized frames like in video encoding.

LANCIR offers up to 200% faster image resizing in comparison to AVIR. The
quality difference is, however, debatable. Note that while LANCIR can take
8- and 16-bit and float image buffers, its precision is limited to 8-bit
resizing.

LANCIR should be seen as a bonus and as some kind of quality comparison.
LANCIR uses Lanczos filter "a" parameter equal to 3 which is similar to AVIR's
default setting.
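
For reference, the Lanczos kernel with parameter "a" is the textbook windowed sinc sketched below. This shows the general formula only; how `CLancIR` tabulates or applies it is not covered here, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Lanczos kernel: sinc(x) * sinc(x / a) for |x| < a, zero elsewhere,
// with sinc(x) = sin(pi x) / (pi x).
double lanczosKernel(double x, int a) {
    assert(a >= 1);
    const double kPi = 3.14159265358979323846;
    const double ax = std::fabs(x);
    if (ax < 1e-12) return 1.0;  // limit of the product at x = 0
    if (ax >= a) return 0.0;     // outside the kernel's support
    const double px = kPi * x;
    return a * std::sin(px) * std::sin(px / a) / (px * px);
}
```

With a = 3 the kernel spans six source samples, equals 1 at x = 0, crosses zero at every other integer, and evaluates to 6 / pi^2 (about 0.608) at x = 0.5.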

## Change log ##
Version 2.4:

* Removed outdated `_mm_reset()` function calls from the SIMD code.
* Changed `float4 round()` to use SSE2 rounding features, avoiding use of
64-bit registers.

Version 2.3:

* Implemented CLancIR image resizing algorithm.
* Fixed a minor image offset on image upsizing.

Version 2.2:

* Released AVIR under a permissive MIT license agreement.

Version 2.1:

* Fixed error-diffusion dither problems introduced in the previous version.
* Added the `-1` switch to the `imageresize` to enable 1-bit output for
dither's quality evaluation (use together with the `-d` switch).
* Added the `--algparams=` switch to the `imageresize` to control resizing
quality (replaces the `--low-ring` switch).
* Added `avir :: CImageResizerParamsULR` parameter set for lowest-ringing
performance possible (not considerably different to
`avir :: CImageResizerParamsLR`, but a bit lower ringing).

Version 2.0:

* Minor inner loop optimizations.
* Lifted the supported image size constraint by switching buffer addressing to
`size_t` from `int`, now image size is limited by the available system memory.
* Added several useful switches to the `imageresize` utility.
* Now `imageresize` does not apply gamma-correction by default.
* Fixed scaling of bit depth-reduction operation.
* Improved error-diffusion dither's signal-to-noise ratio.
* Compiled binaries with AVX2 instruction set (SSE4 for macOS).

## Users ##
This library is used by:

* [Contaware.com](http://www.contaware.com/)

Please drop me a note at aleksey.vaneev@gmail.com and I will include a link to
your software product in the list of users. This list is important for
maintaining confidence in this library among the interested parties.
17065
third_party/avir/avir.h
vendored
Normal file
File diff suppressed because it is too large
71
third_party/avir/avir.mk
vendored
Normal file
@@ -0,0 +1,71 @@
#-*-mode:makefile-gmake;indent-tabs-mode:t;tab-width:8;coding:utf-8-*-┐
#───vi: set et ft=make ts=8 tw=8 fenc=utf-8 :vi───────────────────────┘

PKGS += THIRD_PARTY_AVIR

THIRD_PARTY_AVIR_ARTIFACTS += THIRD_PARTY_AVIR_A
THIRD_PARTY_AVIR = $(THIRD_PARTY_AVIR_A_DEPS) $(THIRD_PARTY_AVIR_A)
THIRD_PARTY_AVIR_A = o/$(MODE)/third_party/avir/avir.a
THIRD_PARTY_AVIR_A_CHECKS = $(THIRD_PARTY_AVIR_A).pkg
THIRD_PARTY_AVIR_A_FILES := $(wildcard third_party/avir/*)
THIRD_PARTY_AVIR_A_SRCS_S = $(filter %.S,$(THIRD_PARTY_AVIR_A_FILES))
THIRD_PARTY_AVIR_A_SRCS_C = $(filter %.c,$(THIRD_PARTY_AVIR_A_FILES))
THIRD_PARTY_AVIR_A_SRCS_X = $(filter %.cc,$(THIRD_PARTY_AVIR_A_FILES))

THIRD_PARTY_AVIR_A_HDRS = \
	$(filter %.h,$(THIRD_PARTY_AVIR_A_FILES)) \
	$(filter %.hpp,$(THIRD_PARTY_AVIR_A_FILES))

THIRD_PARTY_AVIR_A_SRCS = \
	$(THIRD_PARTY_AVIR_A_SRCS_S) \
	$(THIRD_PARTY_AVIR_A_SRCS_C) \
	$(THIRD_PARTY_AVIR_A_SRCS_X)

THIRD_PARTY_AVIR_A_OBJS = \
	$(THIRD_PARTY_AVIR_A_SRCS:%=o/$(MODE)/%.zip.o) \
	$(THIRD_PARTY_AVIR_A_SRCS_S:%.S=o/$(MODE)/%.o) \
	$(THIRD_PARTY_AVIR_A_SRCS_C:%.c=o/$(MODE)/%.o) \
	$(THIRD_PARTY_AVIR_A_SRCS_X:%.cc=o/$(MODE)/%.o)

THIRD_PARTY_AVIR_A_DIRECTDEPS = \
	DSP_CORE \
	LIBC_NEXGEN32E \
	LIBC_BITS \
	LIBC_MEM \
	LIBC_CALLS \
	LIBC_STUBS \
	LIBC_SYSV \
	LIBC_FMT \
	LIBC_UNICODE \
	LIBC_LOG \
	LIBC_TINYMATH

$(THIRD_PARTY_AVIR_A).pkg: \
		$(THIRD_PARTY_AVIR_A_OBJS) \
		$(foreach x,$(THIRD_PARTY_AVIR_A_DIRECTDEPS),$($(x)_A).pkg)

$(THIRD_PARTY_AVIR_A): \
		third_party/avir/ \
		$(THIRD_PARTY_AVIR_A).pkg \
		$(THIRD_PARTY_AVIR_A_OBJS)

#o/$(MODE)/third_party/avir/lanczos1b.o: \
		CXX = clang++-10

o/$(MODE)/third_party/avir/lanczos1b.o \
o/$(MODE)/third_party/avir/lanczos.o: \
		OVERRIDE_CXXFLAGS += \
			$(MATHEMATICAL)

THIRD_PARTY_AVIR_A_DEPS := \
	$(call uniq,$(foreach x,$(THIRD_PARTY_AVIR_A_DIRECTDEPS),$($(x))))

THIRD_PARTY_AVIR_LIBS = $(foreach x,$(THIRD_PARTY_AVIR_ARTIFACTS),$($(x)))
THIRD_PARTY_AVIR_SRCS = $(foreach x,$(THIRD_PARTY_AVIR_ARTIFACTS),$($(x)_SRCS))
THIRD_PARTY_AVIR_HDRS = $(foreach x,$(THIRD_PARTY_AVIR_ARTIFACTS),$($(x)_HDRS))
THIRD_PARTY_AVIR_CHECKS = $(foreach x,$(THIRD_PARTY_AVIR_ARTIFACTS),$($(x)_CHECKS))
THIRD_PARTY_AVIR_OBJS = $(foreach x,$(THIRD_PARTY_AVIR_ARTIFACTS),$($(x)_OBJS))
THIRD_PARTY_AVIR_TESTS = $(foreach x,$(THIRD_PARTY_AVIR_ARTIFACTS),$($(x)_TESTS))

.PHONY: o/$(MODE)/third_party/avir
o/$(MODE)/third_party/avir: $(THIRD_PARTY_AVIR_A_CHECKS)
18
third_party/avir/avir1.h
vendored
Normal file
@@ -0,0 +1,18 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_AVIR1_H_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_AVIR1_H_
#if !(__ASSEMBLER__ + __LINKER__ + 0)
COSMOPOLITAN_C_START_

struct avir1 {
  void *p;
};

void avir1init(struct avir1 *self);
void avir1free(struct avir1 *self);
void avir1(struct avir1 *resizer, size_t dyn, size_t dxn, void *dst,
           size_t dstsize, size_t syn, size_t sxn, size_t ssw, const void *src,
           size_t srcsize);

COSMOPOLITAN_C_END_
#endif /* !(__ASSEMBLER__ + __LINKER__ + 0) */
#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_AVIR1_H_ */
1013
third_party/avir/avir_dil.h
vendored
Normal file
File diff suppressed because it is too large
323
third_party/avir/avir_float4_sse.h
vendored
Normal file
@@ -0,0 +1,323 @@
|
||||
/* clang-format off */
|
||||
//$ nobt
|
||||
//$ nocpp
|
||||
|
||||
/**
|
||||
* @file avir_float4_sse.h
|
||||
*
|
||||
* @brief Inclusion file for the "float4" type.
|
||||
*
|
||||
* This file includes the "float4" SSE-based type used for SIMD variable
|
||||
* storage and processing.
|
||||
*
|
||||
* AVIR Copyright (c) 2015-2019 Aleksey Vaneev
|
||||
*/
|
||||
|
||||
#ifndef AVIR_FLOAT4_SSE_INCLUDED
|
||||
#define AVIR_FLOAT4_SSE_INCLUDED
|
||||
|
||||
#include "third_party/avir/avir.h"
|
||||
#include "libc/bits/mmintrin.h"
|
||||
#include "libc/bits/xmmintrin.h"
|
||||
#include "libc/bits/xmmintrin.h"
|
||||
#include "libc/bits/emmintrin.h"
|
||||
|
||||
namespace avir {
|
||||
|
||||
/**
|
||||
* @brief SIMD packed 4-float type.
|
||||
*
|
||||
* This class implements a packed 4-float type that can be used to perform
|
||||
* parallel computation using SIMD instructions on SSE-enabled processors.
|
||||
* This class can be used as the "fptype" argument of the avir::fpclass_def
|
||||
* class.
|
||||
*/
|
||||
|
||||
class float4
|
||||
{
|
||||
public:
|
||||
float4()
|
||||
{
|
||||
}
|
||||
|
||||
float4( const float4& s )
|
||||
: value( s.value )
|
||||
{
|
||||
}
|
||||
|
||||
float4( const __m128 s )
|
||||
: value( s )
|
||||
{
|
||||
}
|
||||
|
||||
float4( const float s )
|
||||
: value( _mm_set1_ps( s ))
|
||||
{
|
||||
}
|
||||
|
||||
float4& operator = ( const float4& s )
|
||||
{
|
||||
value = s.value;
|
||||
return( *this );
|
||||
}
|
||||
|
||||
float4& operator = ( const __m128 s )
|
||||
{
|
||||
value = s;
|
||||
return( *this );
|
||||
}
|
||||
|
||||
float4& operator = ( const float s )
|
||||
{
|
||||
value = _mm_set1_ps( s );
|
||||
return( *this );
|
||||
}
|
||||
|
||||
operator float () const
|
||||
{
|
||||
return( _mm_cvtss_f32( value ));
|
||||
}
|
||||
|
||||
/**
|
||||
* @param p Pointer to memory from where the value should be loaded,
|
||||
* should be 16-byte aligned.
|
||||
* @return float4 value loaded from the specified memory location.
|
||||
*/
|
||||
|
||||
static float4 load( const float* const p )
|
||||
{
|
||||
return( _mm_load_ps( p ));
|
||||
}
|
||||
|
||||
/**
|
||||
* @param p Pointer to memory from where the value should be loaded,
|
||||
* may have any alignment.
|
||||
* @return float4 value loaded from the specified memory location.
|
||||
*/
|
||||
|
||||
static float4 loadu( const float* const p )
|
||||
{
|
||||
return( _mm_loadu_ps( p ));
|
||||
}
|
||||
|
||||
/**
|
||||
* @param p Pointer to memory from where the value should be loaded,
|
||||
* may have any alignment.
|
||||
* @param lim The maximum number of elements to load, >0.
|
||||
* @return float4 value loaded from the specified memory location, with
|
||||
* elements beyond "lim" set to 0.
|
||||
*/
|
||||
|
||||
static float4 loadu( const float* const p, int lim )
|
||||
{
|
||||
if( lim > 2 )
|
||||
{
|
||||
if( lim > 3 )
|
||||
{
|
||||
return( _mm_loadu_ps( p ));
|
||||
}
|
||||
else
|
||||
{
|
||||
return( _mm_set_ps( 0.0f, p[ 2 ], p[ 1 ], p[ 0 ]));
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
if( lim == 2 )
|
||||
{
|
||||
return( _mm_set_ps( 0.0f, 0.0f, p[ 1 ], p[ 0 ]));
|
||||
}
|
||||
else
|
||||
{
|
||||
return( _mm_load_ss( p ));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
    /**
     * Function stores *this value to the specified memory location.
     *
     * @param[out] p Output memory location, should be 16-byte aligned.
     */

    void store( float* const p ) const
    {
        _mm_store_ps( p, value );
    }

    /**
     * Function stores *this value to the specified memory location.
     *
     * @param[out] p Output memory location, may have any alignment.
     */

    void storeu( float* const p ) const
    {
        _mm_storeu_ps( p, value );
    }

    /**
     * Function stores "lim" lower elements of *this value to the specified
     * memory location.
     *
     * @param[out] p Output memory location, may have any alignment.
     * @param lim The number of lower elements to store, >0.
     */

    void storeu( float* const p, int lim ) const
    {
        if( lim > 2 )
        {
            if( lim > 3 )
            {
                _mm_storeu_ps( p, value );
            }
            else
            {
                _mm_storel_pi( (__m64*) p, value );
                _mm_store_ss( p + 2, _mm_movehl_ps( value, value ));
            }
        }
        else
        {
            if( lim == 2 )
            {
                _mm_storel_pi( (__m64*) p, value );
            }
            else
            {
                _mm_store_ss( p, value );
            }
        }
    }

    float4& operator += ( const float4& s )
    {
        value = _mm_add_ps( value, s.value );
        return( *this );
    }

    float4& operator -= ( const float4& s )
    {
        value = _mm_sub_ps( value, s.value );
        return( *this );
    }

    float4& operator *= ( const float4& s )
    {
        value = _mm_mul_ps( value, s.value );
        return( *this );
    }

    float4& operator /= ( const float4& s )
    {
        value = _mm_div_ps( value, s.value );
        return( *this );
    }

    float4 operator + ( const float4& s ) const
    {
        return( _mm_add_ps( value, s.value ));
    }

    float4 operator - ( const float4& s ) const
    {
        return( _mm_sub_ps( value, s.value ));
    }

    float4 operator * ( const float4& s ) const
    {
        return( _mm_mul_ps( value, s.value ));
    }

    float4 operator / ( const float4& s ) const
    {
        return( _mm_div_ps( value, s.value ));
    }

    /**
     * @return Horizontal sum of elements.
     */

    float hadd() const
    {
        const __m128 v = _mm_add_ps( value, _mm_movehl_ps( value, value ));
        const __m128 res = _mm_add_ss( v, _mm_shuffle_ps( v, v, 1 ));
        return( _mm_cvtss_f32( res ));
    }

    /**
     * Function performs in-place addition of a value located in memory and
     * the specified value.
     *
     * @param p Pointer to value where addition happens. May be unaligned.
     * @param v Value to add.
     */

    static void addu( float* const p, const float4& v )
    {
        ( loadu( p ) + v ).storeu( p );
    }

    /**
     * Function performs in-place addition of a value located in memory and
     * the specified value. Limited to the specified number of elements.
     *
     * @param p Pointer to value where addition happens. May be unaligned.
     * @param v Value to add.
     * @param lim The element number limit, >0.
     */

    static void addu( float* const p, const float4& v, const int lim )
    {
        ( loadu( p, lim ) + v ).storeu( p, lim );
    }

    __m128 value; ///< Packed value of 4 floats.
        ///<
};

/**
 * SIMD rounding function, exact result.
 *
 * @param v Value to round.
 * @return Rounded SIMD value.
 */

inline float4 round( const float4& v )
{
    unsigned int prevrm = _MM_GET_ROUNDING_MODE();
    _MM_SET_ROUNDING_MODE( _MM_ROUND_NEAREST );

    const __m128 res = _mm_cvtepi32_ps( _mm_cvtps_epi32( v.value ));

    _MM_SET_ROUNDING_MODE( prevrm );

    return( res );
}

/**
 * SIMD function "clamps" (clips) the specified packed values so that they are
 * not less than "minv", and not greater than "maxv".
 *
 * @param Value Value to clamp.
 * @param minv Minimal allowed value.
 * @param maxv Maximal allowed value.
 * @return The clamped value.
 */

inline float4 clamp( const float4& Value, const float4& minv,
    const float4& maxv )
{
    return( _mm_min_ps( _mm_max_ps( Value.value, minv.value ), maxv.value ));
}

typedef fpclass_def< avir :: float4, float > fpclass_float4; ///<
    ///< Class that can be used as the "fpclass" template parameter of the
    ///< avir::CImageResizer class to perform calculation using the default
    ///< interleaved algorithm, using SIMD float4 type.
    ///<

} // namespace avir

#endif // AVIR_FLOAT4_SSE_INCLUDED
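
The `lim` overloads of `loadu()`, `storeu()` and `addu()` in `avir_float4_sse.h` handle the tail of a buffer whose length is not a multiple of four. A scalar sketch of the same contract may make it easier to see; the names `Vec4`, `loadLimited`, `storeLimited` and `addLimited` are hypothetical and not part of AVIR.

```cpp
#include <array>
#include <cassert>

// Hypothetical scalar stand-in for the packed 4-float value; not AVIR code.
using Vec4 = std::array<float, 4>;

// Mirrors float4::loadu(p, lim): reads only the first `lim` elements and
// zero-fills the remaining lanes, so memory past p[lim - 1] is never read.
inline Vec4 loadLimited(const float* p, int lim) {
  assert(lim > 0);
  Vec4 v{};  // value-initialized: every lane starts as 0.0f
  for (int i = 0; i < 4 && i < lim; ++i) v[i] = p[i];
  return v;
}

// Mirrors float4::storeu(p, lim): writes only the lower `lim` lanes.
inline void storeLimited(const Vec4& v, float* p, int lim) {
  assert(lim > 0);
  for (int i = 0; i < 4 && i < lim; ++i) p[i] = v[i];
}

// Mirrors float4::addu(p, v, lim): in-place add restricted to `lim` elements.
inline void addLimited(float* p, const Vec4& v, int lim) {
  Vec4 t = loadLimited(p, lim);
  for (int i = 0; i < 4; ++i) t[i] += v[i];
  storeLimited(t, p, lim);
}
```

Elements at index `lim` and beyond are left untouched in memory, which is what lets a caller process a row whose last group holds fewer than four floats.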
365
third_party/avir/avir_float8_avx.h
vendored
Normal file
@@ -0,0 +1,365 @@
/* clang-format off */
//$ nobt
//$ nocpp

/**
 * @file avir_float8_avx.h
 *
 * @brief Inclusion file for the "float8" type.
 *
 * This file includes the "float8" AVX-based type used for SIMD variable
 * storage and processing.
 *
 * AVIR Copyright (c) 2015-2019 Aleksey Vaneev
 */

#ifndef AVIR_FLOAT8_AVX_INCLUDED
#define AVIR_FLOAT8_AVX_INCLUDED

#include "libc/bits/mmintrin.h"
#include "libc/bits/avxintrin.h"
#include "libc/bits/smmintrin.h"
#include "libc/bits/pmmintrin.h"
#include "libc/bits/avx2intrin.h"
#include "libc/bits/xmmintrin.h"
#include "third_party/avir/avir_dil.h"

namespace avir {

/**
 * @brief SIMD packed 8-float type.
 *
 * This class implements a packed 8-float type that can be used to perform
 * parallel computation using SIMD instructions on AVX-enabled processors.
 * This class can be used as the "fptype" argument of the avir::fpclass_def
 * or avir::fpclass_def_dil class.
 */

class float8
{
public:
    float8()
    {
    }

    float8( const float8& s )
        : value( s.value )
    {
    }

    float8( const __m256 s )
        : value( s )
    {
    }

    float8( const float s )
        : value( _mm256_set1_ps( s ))
    {
    }

    float8& operator = ( const float8& s )
    {
        value = s.value;
        return( *this );
    }

    float8& operator = ( const __m256 s )
    {
        value = s;
        return( *this );
    }

    float8& operator = ( const float s )
    {
        value = _mm256_set1_ps( s );
        return( *this );
    }

    operator float () const
    {
        return( _mm_cvtss_f32( _mm256_extractf128_ps( value, 0 )));
    }

    /**
     * @param p Pointer to memory from where the value should be loaded,
     * should be 32-byte aligned.
     * @return float8 value loaded from the specified memory location.
     */

    static float8 load( const float* const p )
    {
        return( _mm256_load_ps( p ));
    }

    /**
     * @param p Pointer to memory from where the value should be loaded,
     * may have any alignment.
     * @return float8 value loaded from the specified memory location.
     */

    static float8 loadu( const float* const p )
    {
        return( _mm256_loadu_ps( p ));
    }

    /**
     * @param p Pointer to memory from where the value should be loaded,
     * may have any alignment.
     * @param lim The maximum number of elements to load, >0.
     * @return float8 value loaded from the specified memory location, with
     * elements beyond "lim" set to 0.
     */

    static float8 loadu( const float* const p, const int lim )
    {
        __m128 lo;
        __m128 hi;

        if( lim > 4 )
        {
            lo = _mm_loadu_ps( p );
            hi = loadu4( p + 4, lim - 4 );
        }
        else
        {
            lo = loadu4( p, lim );
            hi = _mm_setzero_ps();
        }

        return( _mm256_insertf128_ps( _mm256_castps128_ps256( lo ), hi, 1 ));
    }

    /**
     * Function stores *this value to the specified memory location.
     *
     * @param[out] p Output memory location, should be 32-byte aligned.
     */

    void store( float* const p ) const
    {
        _mm256_store_ps( p, value );
    }

    /**
     * Function stores *this value to the specified memory location.
     *
     * @param[out] p Output memory location, may have any alignment.
     */

    void storeu( float* const p ) const
    {
        _mm256_storeu_ps( p, value );
    }

    /**
     * Function stores "lim" lower elements of *this value to the specified
     * memory location.
     *
     * @param[out] p Output memory location, may have any alignment.
     * @param lim The number of lower elements to store, >0.
     */

    void storeu( float* p, int lim ) const
    {
        __m128 v;

        if( lim > 4 )
        {
            _mm_storeu_ps( p, _mm256_extractf128_ps( value, 0 ));
            v = _mm256_extractf128_ps( value, 1 );
            p += 4;
            lim -= 4;
        }
        else
        {
            v = _mm256_extractf128_ps( value, 0 );
        }

        if( lim > 2 )
        {
            if( lim > 3 )
            {
                _mm_storeu_ps( p, v );
            }
            else
            {
                _mm_storel_pi( (__m64*) p, v );
                _mm_store_ss( p + 2, _mm_movehl_ps( v, v ));
            }
        }
        else
        {
            if( lim == 2 )
            {
                _mm_storel_pi( (__m64*) p, v );
            }
            else
            {
                _mm_store_ss( p, v );
            }
        }
    }

    float8& operator += ( const float8& s )
    {
        value = _mm256_add_ps( value, s.value );
        return( *this );
    }

    float8& operator -= ( const float8& s )
    {
        value = _mm256_sub_ps( value, s.value );
        return( *this );
    }

    float8& operator *= ( const float8& s )
    {
        value = _mm256_mul_ps( value, s.value );
        return( *this );
    }

    float8& operator /= ( const float8& s )
    {
        value = _mm256_div_ps( value, s.value );
        return( *this );
    }

    float8 operator + ( const float8& s ) const
    {
        return( _mm256_add_ps( value, s.value ));
    }

    float8 operator - ( const float8& s ) const
    {
        return( _mm256_sub_ps( value, s.value ));
    }

    float8 operator * ( const float8& s ) const
    {
        return( _mm256_mul_ps( value, s.value ));
    }

    float8 operator / ( const float8& s ) const
    {
        return( _mm256_div_ps( value, s.value ));
    }

    /**
     * @return Horizontal sum of elements.
     */

    float hadd() const
    {
        __m128 v = _mm_add_ps( _mm256_extractf128_ps( value, 0 ),
            _mm256_extractf128_ps( value, 1 ));

        v = _mm_hadd_ps( v, v );
        v = _mm_hadd_ps( v, v );
        return( _mm_cvtss_f32( v ));
    }

    /**
     * Function performs in-place addition of a value located in memory and
     * the specified value.
     *
     * @param p Pointer to value where addition happens. May be unaligned.
     * @param v Value to add.
     */

    static void addu( float* const p, const float8& v )
    {
        ( loadu( p ) + v ).storeu( p );
    }

    /**
     * Function performs in-place addition of a value located in memory and
     * the specified value. Limited to the specified number of elements.
     *
     * @param p Pointer to value where addition happens. May be unaligned.
     * @param v Value to add.
     * @param lim The element number limit, >0.
     */

    static void addu( float* const p, const float8& v, const int lim )
    {
        ( loadu( p, lim ) + v ).storeu( p, lim );
    }

    __m256 value; ///< Packed value of 8 floats.
        ///<

private:
    /**
     * @param p Pointer to memory from where the value should be loaded,
     * may have any alignment.
     * @param lim The maximum number of elements to load, >0.
     * @return __m128 value loaded from the specified memory location, with
     * elements beyond "lim" set to 0.
     */

    static __m128 loadu4( const float* const p, const int lim )
    {
        if( lim > 2 )
        {
            if( lim > 3 )
            {
                return( _mm_loadu_ps( p ));
            }
            else
            {
                return( _mm_set_ps( 0.0f, p[ 2 ], p[ 1 ], p[ 0 ]));
            }
        }
        else
        {
            if( lim == 2 )
            {
                return( _mm_set_ps( 0.0f, 0.0f, p[ 1 ], p[ 0 ]));
            }
            else
            {
                return( _mm_load_ss( p ));
            }
        }
    }
};

/**
 * SIMD rounding function, exact result.
 *
 * @param v Value to round.
 * @return Rounded SIMD value.
 */

inline float8 round( const float8& v )
{
    return( _mm256_round_ps( v.value,
        ( _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC )));
}

/**
 * SIMD function "clamps" (clips) the specified packed values so that they are
 * not less than "minv", and not greater than "maxv".
 *
 * @param Value Value to clamp.
 * @param minv Minimal allowed value.
 * @param maxv Maximal allowed value.
 * @return The clamped value.
 */

inline float8 clamp( const float8& Value, const float8& minv,
    const float8& maxv )
{
    return( _mm256_min_ps( _mm256_max_ps( Value.value, minv.value ),
        maxv.value ));
}

typedef fpclass_def_dil< float, avir :: float8 > fpclass_float8_dil; ///<
    ///< Class that can be used as the "fpclass" template parameter of the
    ///< avir::CImageResizer class to perform calculation using the
    ///< de-interleaved SIMD algorithm, using SIMD float8 type.
    ///<

} // namespace avir

#endif // AVIR_FLOAT8_AVX_INCLUDED
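
Two details of `float8` in `avir_float8_avx.h` are easy to miss when reading the intrinsics: `hadd()` first folds the upper 128-bit half onto the lower half and then does two pairwise adds, and `loadu( p, lim )` splits `lim` between the lower and upper halves. A scalar sketch of both follows; `Vec8`, `horizontalSum8`, `HalfCounts` and `splitLimit` are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical scalar stand-in for the packed 8-float value; not AVIR code.
using Vec8 = std::array<float, 8>;

// Same summation order as float8::hadd(): add the upper half to the lower
// half lane by lane, then reduce the four partial sums pairwise.
inline float horizontalSum8(const Vec8& a) {
  const float v0 = a[0] + a[4];
  const float v1 = a[1] + a[5];
  const float v2 = a[2] + a[6];
  const float v3 = a[3] + a[7];
  return (v0 + v1) + (v2 + v3);
}

// How many elements each 128-bit half receives in float8::loadu(p, lim).
struct HalfCounts {
  int lo;  // elements loaded into the lower half
  int hi;  // elements loaded into the upper half
};

inline HalfCounts splitLimit(int lim) {
  assert(lim > 0);
  if (lim > 4) return {4, std::min(lim - 4, 4)};  // lower half is full
  return {lim, 0};                                // upper half stays zero
}
```

Because floating-point addition is not associative, the grouping in `horizontalSum8` matters when comparing results against a plain left-to-right loop.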
1494
third_party/avir/lancir.h
vendored
Normal file
File diff suppressed because it is too large
40
third_party/avir/lanczos.cc
vendored
Normal file
@@ -0,0 +1,40 @@
/*-*-mode:c++;indent-tabs-mode:nil;c-basic-offset:2;tab-width:8;coding:utf-8-*-│
│vi: set net ft=c++ ts=2 sts=2 sw=2 fenc=utf-8                              :vi│
╞══════════════════════════════════════════════════════════════════════════════╡
│ Copyright 2020 Justine Alexandra Roberts Tunney                              │
│                                                                              │
│ This program is free software; you can redistribute it and/or modify         │
│ it under the terms of the GNU General Public License as published by         │
│ the Free Software Foundation; version 2 of the License.                      │
│                                                                              │
│ This program is distributed in the hope that it will be useful, but          │
│ WITHOUT ANY WARRANTY; without even the implied warranty of                   │
│ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU             │
│ General Public License for more details.                                     │
│                                                                              │
│ You should have received a copy of the GNU General Public License            │
│ along with this program; if not, write to the Free Software                  │
│ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA                │
│ 02110-1301 USA                                                               │
╚─────────────────────────────────────────────────────────────────────────────*/
#include "libc/limits.h"
#include "libc/log/check.h"
#include "libc/log/log.h"
#include "third_party/avir/lanczos.h"
namespace {
#include "third_party/avir/lancir.h"
} // namespace

/**
 * Does Lanczos interpolation.
 * @note computers w/o AVX2+FMA need to call BilinearScale()
 */
void lanczos(unsigned dyn, unsigned dxn, void *restrict dst, unsigned syn,
             unsigned sxn, const void *restrict src, unsigned sw) {
  avir::CLancIR lanczos;
  DCHECK_ALIGNED(64, dst);
  DCHECK_ALIGNED(64, src);
  LOGF("%10s%5ux×%-5u→%5u×%-5u", "lanczos", sxn, syn, dxn, dyn);
  lanczos.resizeImage((const float *)src, sxn, syn, sw, (float *)dst, dxn, dyn,
                      4);
}
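
`lanczos()` asserts that both buffers pass `DCHECK_ALIGNED(64, ...)`. The macro itself is not shown in this diff; presumably it verifies 64-byte alignment in debug builds. A minimal sketch of such a check, using a hypothetical helper `isAligned` that is not part of Cosmopolitan or AVIR:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when `p` is a multiple of `alignment` bytes.
// `alignment` must be a non-zero power of two.
inline bool isAligned(const void* p, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}
```

A caller could declare its pixel buffers with `alignas(64)` (or obtain them from an aligned allocator) so that a check of this kind holds before the pointers are handed to `lanczos()`.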
13
third_party/avir/lanczos.h
vendored
Normal file
@@ -0,0 +1,13 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS_H_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS_H_
#if !(__ASSEMBLER__ + __LINKER__ + 0)
COSMOPOLITAN_C_START_

void lanczos(unsigned, unsigned, void *, unsigned, unsigned, const void *,
             unsigned);
void lanczos3(unsigned, unsigned, void *, unsigned, unsigned, const void *,
              unsigned);

COSMOPOLITAN_C_END_
#endif /* !(__ASSEMBLER__ + __LINKER__ + 0) */
#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS_H_ */
77
third_party/avir/lanczos1.cc
vendored
Normal file
@@ -0,0 +1,77 @@
/*-*-mode:c++;indent-tabs-mode:nil;c-basic-offset:2;tab-width:8;coding:utf-8-*-│
│vi: set net ft=c++ ts=2 sts=2 sw=2 fenc=utf-8                              :vi│
╞══════════════════════════════════════════════════════════════════════════════╡
│ Copyright 2020 Justine Alexandra Roberts Tunney                              │
│                                                                              │
│ This program is free software; you can redistribute it and/or modify         │
│ it under the terms of the GNU General Public License as published by         │
│ the Free Software Foundation; version 2 of the License.                      │
│                                                                              │
│ This program is distributed in the hope that it will be useful, but          │
│ WITHOUT ANY WARRANTY; without even the implied warranty of                   │
│ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU             │
│ General Public License for more details.                                     │
│                                                                              │
│ You should have received a copy of the GNU General Public License            │
│ along with this program; if not, write to the Free Software                  │
│ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA                │
│ 02110-1301 USA                                                               │
╚─────────────────────────────────────────────────────────────────────────────*/
#include "libc/bits/xmmintrin.h"
#include "libc/limits.h"
#include "libc/log/log.h"
#include "libc/runtime/runtime.h"
#include "third_party/avir/lanczos1.h"
namespace {
#include "third_party/avir/lanczos1.hpp"
} // namespace

void lanczos1init(struct lanczos1 *resizer) {
  lanczos1free(resizer);
  resizer->p = new Lanczos1Impl;
}

void lanczos1free(struct lanczos1 *resizer) {
  Lanczos1Impl *impl;
  if (!resizer->p) return;
  impl = (Lanczos1Impl *)resizer->p;
  delete impl;
  resizer->p = nullptr;
}

/**
 * Resizes image plane w/ Lanczos interpolation, e.g.
 *
 *     struct lanczos1 scaler = {0};
 *     lanczos1init(&scaler);
 *     lanczos1(&scaler, ...);
 *     lanczos1free(&scaler);
 *
 * @param dyn is destination height
 * @param dxn is destination width
 * @param dst is destination unsigned char array
 * @param dstsize is number of bytes in dst
 * @param syn is source height
 * @param sxn is source width
 * @param ssw is number of unsigned chars per scanline in src
 * @param src is source unsigned char array
 * @param srcsize is number of bytes in src
 */
void lanczos1(struct lanczos1 *resizer, size_t dyn, size_t dxn, void *dst,
              size_t dstsize, size_t syn, size_t sxn, size_t ssw,
              const void *src, size_t srcsize) {
  Lanczos1Impl *impl;
  unsigned int roundhouse;
  LOGF("%10s%5zux×%-5zu→%5zu×%-5zu", "lanczos1", sxn, syn, dxn, dyn);
  CHECK_LE(dstsize, INT_MAX);
  CHECK_LE(srcsize, INT_MAX);
  CHECK_LE(sizeof(unsigned char) * 1 * dyn * dxn, dstsize);
  CHECK_LE(sizeof(unsigned char) * 1 * syn * sxn, srcsize);
  CHECK_LE(sizeof(unsigned char) * syn * ssw, srcsize);
  roundhouse = _MM_GET_ROUNDING_MODE();
  _MM_SET_ROUNDING_MODE(_MM_ROUND_NEAREST);
  impl = (Lanczos1Impl *)resizer->p;
  impl->lanczos.resizeImage((const unsigned char *)src, sxn, syn, ssw,
                            (unsigned char *)dst, dxn, dyn, 1);
  _MM_SET_ROUNDING_MODE(roundhouse);
}
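
`lanczos1()` saves the SSE rounding mode, forces round-to-nearest for the duration of the resize, and restores the caller's mode afterwards. The same save/set/restore discipline can be sketched portably with `<cfenv>`; the class `ScopedRoundToNearest` and the function `roundNearest` below are hypothetical, and the sketch uses the standard floating-point environment rather than the `_MM_*_ROUNDING_MODE` macros used in the source.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical RAII guard: switches to round-to-nearest on construction and
// restores whatever mode was active before on destruction.
class ScopedRoundToNearest {
 public:
  ScopedRoundToNearest() : saved_(std::fegetround()) {
    std::fesetround(FE_TONEAREST);
  }
  ~ScopedRoundToNearest() { std::fesetround(saved_); }
  ScopedRoundToNearest(const ScopedRoundToNearest&) = delete;
  ScopedRoundToNearest& operator=(const ScopedRoundToNearest&) = delete;

 private:
  int saved_;  // rounding mode that was active before the guard
};

// Rounds to the nearest integer value, ties to even, regardless of the
// caller's rounding mode.
inline float roundNearest(float x) {
  ScopedRoundToNearest guard;
  return std::nearbyint(x);
}
```

A guard object restores the mode on every exit path, which the explicit pair of `_MM_SET_ROUNDING_MODE` calls would not do if the code between them returned early.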
18
third_party/avir/lanczos1.h
vendored
Normal file
@@ -0,0 +1,18 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1_H_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1_H_
#if !(__ASSEMBLER__ + __LINKER__ + 0)
COSMOPOLITAN_C_START_

struct lanczos1 {
  void *p;
};

void lanczos1init(struct lanczos1 *self);
void lanczos1free(struct lanczos1 *self);
void lanczos1(struct lanczos1 *self, size_t dyn, size_t dxn, void *dst,
              size_t dstsize, size_t syn, size_t sxn, size_t ssw,
              const void *src, size_t srcsize) paramsnonnull();

COSMOPOLITAN_C_END_
#endif /* !(__ASSEMBLER__ + __LINKER__ + 0) */
#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1_H_ */
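
`struct lanczos1` exposes only a `void *p`, so C callers never see the C++ object behind it, and `lanczos1init()` calls `lanczos1free()` first, which is why the handle must start out zeroed. A reduced sketch of that opaque-handle pattern, with hypothetical names (`Engine`, `Handle`, `handleInit`, `handleFree`) standing in for the real ones:

```cpp
#include <cassert>

// Hypothetical stand-in for the C++ implementation object.
struct Engine {
  int calls = 0;
};

// C-visible handle, shaped like struct lanczos1.
struct Handle {
  void* p;
};

// Safe to call on a zeroed handle and safe to call twice.
inline void handleFree(Handle* h) {
  if (!h->p) return;
  delete static_cast<Engine*>(h->p);
  h->p = nullptr;
}

// Releases any previous object first, so h->p must be null or valid on
// entry; an uninitialized handle would make the delete undefined.
inline void handleInit(Handle* h) {
  handleFree(h);
  h->p = new Engine;
}
```

Re-initializing an existing handle therefore does not leak, and freeing is idempotent.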
11
third_party/avir/lanczos1.hpp
vendored
Normal file
@@ -0,0 +1,11 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1_HPP_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1_HPP_
#include "third_party/avir/lancir.h"

struct Lanczos1Impl {
  Lanczos1Impl() : lanczos{} {
  }
  avir::CLancIR lanczos;
};

#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1_HPP_ */
31
third_party/avir/lanczos1b.cc
vendored
Normal file
@@ -0,0 +1,31 @@
/*-*-mode:c++;indent-tabs-mode:nil;c-basic-offset:2;tab-width:8;coding:utf-8-*-│
│vi: set net ft=c++ ts=2 sts=2 sw=2 fenc=utf-8                              :vi│
╞══════════════════════════════════════════════════════════════════════════════╡
│ Copyright 2020 Justine Alexandra Roberts Tunney                              │
│                                                                              │
│ This program is free software; you can redistribute it and/or modify         │
│ it under the terms of the GNU General Public License as published by         │
│ the Free Software Foundation; version 2 of the License.                      │
│                                                                              │
│ This program is distributed in the hope that it will be useful, but          │
│ WITHOUT ANY WARRANTY; without even the implied warranty of                   │
│ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU             │
│ General Public License for more details.                                     │
│                                                                              │
│ You should have received a copy of the GNU General Public License            │
│ along with this program; if not, write to the Free Software                  │
│ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA                │
│ 02110-1301 USA                                                               │
╚─────────────────────────────────────────────────────────────────────────────*/
#include "libc/bits/bits.h"
#include "third_party/avir/lanczos1b.h"
namespace {
#include "third_party/avir/lancir.h"
} // namespace

void lanczos1b(size_t dyn, size_t dxn, unsigned char *restrict dst, size_t syn,
               size_t sxn, const unsigned char *restrict src) {
  avir::CLancIR lanczos;
  LOGF("%10s%5zux×%-5zu→%5zu×%-5zu", "lanczos1b", sxn, syn, dxn, dyn);
  lanczos.resizeImage(src, sxn, syn, roundup2pow(sxn) * 4, dst, dxn, dyn, 4);
}
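
`lanczos1b()` passes `roundup2pow(sxn) * 4` as the source scanline size, which suggests that rows are padded to a power-of-two pixel count with four elements per pixel. `roundup2pow()` is not shown in this diff, so the sketch below assumes it returns the smallest power of two that is not less than its argument; `roundUpPow2` and `paddedScanline` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Assumed contract (a guess, not taken from the source): smallest power of
// two >= x, with 0 and 1 both mapping to 1.
inline std::size_t roundUpPow2(std::size_t x) {
  // Guard against the loop overflowing for values above the top power of two.
  assert(x <= static_cast<std::size_t>(-1) / 2 + 1);
  std::size_t p = 1;
  while (p < x) p <<= 1;
  return p;
}

// Scanline size in elements for rows padded to a power-of-two pixel count.
inline std::size_t paddedScanline(std::size_t widthPixels,
                                  std::size_t channels) {
  return roundUpPow2(widthPixels) * channels;
}
```

Under that assumption a 640-pixel-wide, 4-channel source would be read with a scanline size of 4096 elements, so the caller's buffer has to be laid out with the same padding.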
11
third_party/avir/lanczos1b.h
vendored
Normal file
@@ -0,0 +1,11 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1B_H_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1B_H_
#if !(__ASSEMBLER__ + __LINKER__ + 0)
COSMOPOLITAN_C_START_

void lanczos1b(size_t dyn, size_t dxn, unsigned char *restrict dst, size_t syn,
               size_t sxn, const unsigned char *restrict src);

COSMOPOLITAN_C_END_
#endif /* !(__ASSEMBLER__ + __LINKER__ + 0) */
#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1B_H_ */
63
third_party/avir/lanczos1f.cc
vendored
Normal file
@@ -0,0 +1,63 @@
/*-*-mode:c++;indent-tabs-mode:nil;c-basic-offset:2;tab-width:8;coding:utf-8-*-│
│vi: set net ft=c++ ts=2 sts=2 sw=2 fenc=utf-8                              :vi│
╞══════════════════════════════════════════════════════════════════════════════╡
│ Copyright 2020 Justine Alexandra Roberts Tunney                              │
│                                                                              │
│ This program is free software; you can redistribute it and/or modify         │
│ it under the terms of the GNU General Public License as published by         │
│ the Free Software Foundation; version 2 of the License.                      │
│                                                                              │
│ This program is distributed in the hope that it will be useful, but          │
│ WITHOUT ANY WARRANTY; without even the implied warranty of                   │
│ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU             │
│ General Public License for more details.                                     │
│                                                                              │
│ You should have received a copy of the GNU General Public License            │
│ along with this program; if not, write to the Free Software                  │
│ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA                │
│ 02110-1301 USA                                                               │
╚─────────────────────────────────────────────────────────────────────────────*/
#include "libc/bits/xmmintrin.h"
#include "libc/runtime/runtime.h"
#include "third_party/avir/lanczos1f.h"
namespace {
#include "third_party/avir/lanczos1f.hpp"
} // namespace

void lanczos1finit(struct lanczos1f *resizer) {
  lanczos1ffree(resizer);
  resizer->p = new Lanczos1fImpl;
}

void lanczos1ffree(struct lanczos1f *resizer) {
  Lanczos1fImpl *impl;
  if (!resizer->p) return;
  impl = (Lanczos1fImpl *)resizer->p;
  delete impl;
  resizer->p = nullptr;
}

/**
 * Resizes image plane w/ Lanczos interpolation, e.g.
 *
 *     struct lanczos1f scaler = {0};
 *     lanczos1finit(&scaler);
 *     lanczos1f(&scaler, ...);
 *     lanczos1ffree(&scaler);
 *
 * @param dyn is destination height
 * @param dxn is destination width
 * @param dst is destination float array
 * @param syn is source height
 * @param sxn is source width
 * @param ssw is number of floats per scanline in src
 * @param src is source float array
 */
void lanczos1f(struct lanczos1f *resizer, size_t dyn, size_t dxn, void *dst,
               size_t syn, size_t sxn, size_t ssw, const void *src, double ky0,
               double kx0, double oy, double ox) {
  Lanczos1fImpl *impl;
  impl = (Lanczos1fImpl *)resizer->p;
  impl->lanczos.resizeImage((const float *)src, sxn, syn, ssw, (float *)dst,
                            dxn, dyn, 1, kx0, ky0, ox, oy);
}
18
third_party/avir/lanczos1f.h
vendored
Normal file
@@ -0,0 +1,18 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1F_H_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1F_H_
#if !(__ASSEMBLER__ + __LINKER__ + 0)
COSMOPOLITAN_C_START_

struct lanczos1f {
  void *p;
};

void lanczos1finit(struct lanczos1f *);
void lanczos1ffree(struct lanczos1f *);
void lanczos1f(struct lanczos1f *, size_t, size_t, void *, size_t, size_t,
               size_t, const void *, double, double, double, double)
    paramsnonnull();

COSMOPOLITAN_C_END_
#endif /* !(__ASSEMBLER__ + __LINKER__ + 0) */
#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1F_H_ */
11
third_party/avir/lanczos1f.hpp
vendored
Normal file
@@ -0,0 +1,11 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1F_HPP_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1F_HPP_
#include "third_party/avir/lancir.h"

struct Lanczos1fImpl {
  Lanczos1fImpl() : lanczos{} {
  }
  avir::CLancIR lanczos;
};

#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_LANCZOS1F_HPP_ */
30
third_party/avir/lanczos3.cc
vendored
Normal file
@@ -0,0 +1,30 @@
/*-*-mode:c++;indent-tabs-mode:nil;c-basic-offset:2;tab-width:8;coding:utf-8-*-│
│vi: set net ft=c++ ts=2 sts=2 sw=2 fenc=utf-8                              :vi│
╞══════════════════════════════════════════════════════════════════════════════╡
│ Copyright 2020 Justine Alexandra Roberts Tunney                              │
│                                                                              │
│ This program is free software; you can redistribute it and/or modify         │
│ it under the terms of the GNU General Public License as published by         │
│ the Free Software Foundation; version 2 of the License.                      │
│                                                                              │
│ This program is distributed in the hope that it will be useful, but          │
│ WITHOUT ANY WARRANTY; without even the implied warranty of                   │
│ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU             │
│ General Public License for more details.                                     │
│                                                                              │
│ You should have received a copy of the GNU General Public License            │
│ along with this program; if not, write to the Free Software                  │
│ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA                │
│ 02110-1301 USA                                                               │
╚─────────────────────────────────────────────────────────────────────────────*/
#include "third_party/avir/lanczos.h"
namespace {
#include "third_party/avir/lancir.h"
}

void lanczos3(unsigned dyn, unsigned dxn, void *dst, unsigned syn, unsigned sxn,
              const void *src, unsigned sw) {
  avir::CLancIR lanczos;
  lanczos.resizeImage((const float *)src, sxn, syn, sw, (float *)dst, dxn, dyn,
                      3, -1, -2);
}
11
third_party/avir/notice.h
vendored
Normal file
@@ -0,0 +1,11 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_NOTICE_H_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_NOTICE_H_
#if !(__ASSEMBLER__ + __LINKER__ + 0)

asm(".ident\t\"\\n\\n\
AVIR (MIT License)\\n\
Copyright 2015-2019 Aleksey Vaneev\"");
asm(".include \"libc/disclaimer.inc\"");

#endif /* !(__ASSEMBLER__ + __LINKER__ + 0) */
#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_NOTICE_H_ */
48
third_party/avir/resize.cc
vendored
Normal file
@@ -0,0 +1,48 @@
/*-*-mode:c++;indent-tabs-mode:nil;c-basic-offset:2;tab-width:8;coding:utf-8-*-│
│vi: set net ft=c++ ts=2 sts=2 sw=2 fenc=utf-8                              :vi│
╞══════════════════════════════════════════════════════════════════════════════╡
│ Copyright 2020 Justine Alexandra Roberts Tunney                              │
│                                                                              │
│ This program is free software; you can redistribute it and/or modify         │
│ it under the terms of the GNU General Public License as published by         │
│ the Free Software Foundation; version 2 of the License.                      │
│                                                                              │
│ This program is distributed in the hope that it will be useful, but          │
│ WITHOUT ANY WARRANTY; without even the implied warranty of                   │
│ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU             │
│ General Public License for more details.                                     │
│                                                                              │
│ You should have received a copy of the GNU General Public License            │
│ along with this program; if not, write to the Free Software                  │
│ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA                │
│ 02110-1301 USA                                                               │
╚─────────────────────────────────────────────────────────────────────────────*/
#include "third_party/avir/resize.h"
namespace {
#include "third_party/avir/avir_float4_sse.h"
} // namespace

struct ResizerImpl {
  ResizerImpl() : resizer{8, 8, avir::CImageResizerParamsULR()} {}
  avir::CImageResizer<avir::fpclass_float4> resizer;
};

void NewResizer(Resizer *resizer, int aResBitDepth, int aSrcBitDepth) {
  FreeResizer(resizer);
  resizer->p = new ResizerImpl();
}

void FreeResizer(Resizer *resizer) {
  if (!resizer->p) return;
  delete (ResizerImpl *)resizer->p;
  resizer->p = nullptr;
}

void ResizeImage(Resizer *resizer, float *Dest, int DestHeight, int DestWidth,
                 const float *Src, int SrcHeight, int SrcWidth) {
  ResizerImpl *impl = (ResizerImpl *)resizer->p;
  int SrcScanLineSize = 0;
  double ResizingStep = 0;
  impl->resizer.resizeImage(Src, SrcWidth, SrcHeight, SrcScanLineSize, Dest,
                            DestWidth, DestHeight, 4, ResizingStep);
}
17
third_party/avir/resize.h
vendored
Normal file
@@ -0,0 +1,17 @@
#ifndef COSMOPOLITAN_THIRD_PARTY_AVIR_RESIZE_H_
#define COSMOPOLITAN_THIRD_PARTY_AVIR_RESIZE_H_
#if !(__ASSEMBLER__ + __LINKER__ + 0)
COSMOPOLITAN_C_START_

struct Resizer {
  void *p;
};

void FreeResizer(struct Resizer *) paramsnonnull();
void NewResizer(struct Resizer *, int, int) paramsnonnull();
void ResizeImage(struct Resizer *, float *, int, int, const float *, int, int)
    paramsnonnull();

COSMOPOLITAN_C_END_
#endif /* !(__ASSEMBLER__ + __LINKER__ + 0) */
#endif /* COSMOPOLITAN_THIRD_PARTY_AVIR_RESIZE_H_ */