4,991+ open-access research outputs.
We present fast-vollib, an open-source Python library that provides high-performance European option pricing, implied volatility (IV) computation, and Greeks under the Black-76, Black-Scholes, and Bla…
For NVIDIA GPUs, CUDA is the primary interface through which applications orchestrate GPU execution, yet much of the logic that realizes CUDA operations resides in NVIDIA's closed-source userspace dri…
3D Gaussian Splatting (3DGS) achieves high-quality novel view synthesis with real-time rendering, but its storage cost remains prohibitive for practical deployment. Existing post-training compression …
We present the SPHEREx Ultracool Dwarf spectral Atlas (SUDA), a homogeneous sample of 1675 ultracool dwarfs with continuous 0.75--5 $\mu$m spectroscopy from SPHEREx QR2. Using the SAND and ATMO2020++ …
Deep learning compilers and vendor libraries deliver strong baseline performance but are bounded by finite, engineer-curated catalogs. When these omit needed optimizations, practitioners substitute ha…
We introduce $\texttt{cuSkyrmion}$, a 3-dimensional Skyrme model computation and visualization software written in CUDA C for rapid computation and visualization of especially the arrested Ne…
Recent image editing models have achieved strong visual fidelity but often struggle with tasks requiring complex reasoning. To investigate and enhance the reasoning-grounded planning for image editing…
Efficient GPU execution of convolution operators is governed by memory-access efficiency, on-chip data reuse, and execution mapping rather than arithmetic throughput alone. This paper presents a contr…
We show that the loop homology algebras of polyhedral products of the form $(\underline{X},\underline{*})^{\mathcal{K}}$ can be written as a colimit over the flagification of $\mathcal{K}$, and obtain…
3D point cloud perception remains tightly coupled to custom CUDA operators for spatial operations, limiting portability and efficiency on non-NVIDIA, AMD, and embedded hardware. We introduce PointTran…
Fully decentralized Muon is difficult because its nonlinear matrix-sign operator does not commute with linear gossip averaging. This makes decentralized Muon a structural design problem: in designing …
We construct a family of velocity fields demonstrating the sharpness of the classical Zvonkin--Veretennikov--Davie strong well-posedness by noise regime. We consider stochastic differential equations …
Classical multivariate statistical methods such as covariance estimation and principal component analysis are well understood mathematically, yet their application at extreme data scales remains chall…
Existing attention accelerators often trade exact softmax semantics, depend on fused Tensor Core kernels, or incur sequential depth that limits FP32 throughput on long sequences. We present \textbf{EL…
Large language model-powered sequential recommender systems (LLM-SRSs) have recently demonstrated remarkable performance, enabling recommendations through prompt-driven inference over user interaction…
Large language model (LLM) decoding is latency-sensitive and often bottlenecked by fragmented operator execution and repeated off-chip materialization of intermediate tensors. Prior work expands fusio…
Large Language Models (LLMs) have achieved strong performance across natural language and multimodal tasks, yet their practical deployment remains constrained by inference latency and kernel launch ov…
NVIDIA's CUDA Tile (CuTile) introduces a Python-based, tile-centric abstraction for GPU kernel development that aims to simplify programming while retaining Tensor Core and Tensor Memory Accelerator (…
Artificial Intelligence (AI) has become a powerful tool for model-free Radio Access Network (RAN) signal processing and optimization. However, designing a single model that generalizes across all radi…
Submillimeter (submm) integral field units (IFUs) utilising kinetic inductance detectors (KIDs) are a promising instrument architecture for the study of galaxies, galaxy clusters, and the large-scale …