Hi there 👋 | Oriol Alàs Cercós

Hi, I’m Oriol Alàs Cercós, an AI Engineer and Researcher 🔬 with a passion for Machine Learning, Computer Vision, and Large Language Models (LLMs).

This blog is a personal notebook where I document projects, experiments, and ideas related to artificial intelligence, machine learning, and applied research.

InnWater – Water Tariff Dashboard

AI-augmented economic simulation platform for sustainable and equitable water tariff design within the WEFE nexus.

Python · FastAPI · Angular
InnWater – AI-Augmented Water Governance Platform

AI-powered decision-support system for water governance assessment and policy optimization within the WEFE nexus.

Python · FastAPI · Angular
QR Nativity Challenge

Gamified QR-based Christmas challenge for local commerce.

Angular · Django REST API · Google OAuth
Alàs, O., Cervera, R., & Sanfeliu, R. (2026). A Scalable Real-Time Multi-Camera Vehicle Tracking System for Urban Environments. IEEE Access
Velasquez-Camacho, L., et al. (2025). Monitoring temporal changes in large urban street trees using remote sensing and deep learning. PLOS ONE.
Alàs, O., & Sebé, F. (2024). Privacy-Preserving Electricity Trading for Connected Microgrids. Applied Sciences.

Latest Posts

Reviewing YOLO: You Only Look Once

Object detection is one of the most popular tasks in computer vision, since it applies to a wide range of applications: robotics, autonomous driving, or fault detection. In this post, we give a brief overview of the YOLO algorithm and the components that make it work. To do that, I have classified the main components of the algorithm into three categories: characteristics of the model architecture, i.e., how YOLO-based models improved performance with a new architecture and which improvements were made; strategies for model training, such as the loss function or data augmentation; and methods for post-processing the model's output, such as non-maximum suppression (NMS) and the confidence threshold. Two-stage vs One-stage Detectors: Before YOLO, state-of-the-art detectors were two-stage: the first stage proposes bounding boxes, and the second stage classifies them. These models are called region-based detectors, because they need the regions before running the classification. ...

Date: April 25, 2026 · Estimated Reading Time: 12 min · Author: Oriol Alàs Cercós
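The post's third category mentions non-maximum suppression. As a rough illustration of what that step does, here is a minimal NumPy sketch of greedy NMS; the `[x1, y1, x2, y2]` box format, the toy boxes, and the 0.5 IoU threshold are illustrative choices, not the post's exact settings:

```python
import numpy as np

def iou(box, boxes):
    # Intersection-over-union of one box against an array of boxes [x1, y1, x2, y2].
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    # Greedily keep the highest-scoring box, drop boxes that overlap it too much.
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        overlaps = iou(boxes[i], boxes[order[1:]])
        order = order[1:][overlaps < iou_thresh]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)  # the second box overlaps the first heavily and is suppressed
```

The second box has IoU ≈ 0.68 with the first, so only the first and third survive.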

Loss functions and their final-layer activations

When taking our first steps in deep learning, we grasp the idea of using a neural network to learn a function that maps data to other data. We are often told that neural networks are a powerful tool in machine learning because of their non-linearity and their ability to learn complex functions from data, which comes down to minimizing some loss function. In this post, we will explore how the final-layer activation depends on the loss function of our problem. ...

Date: March 20, 2026 · Estimated Reading Time: 11 min · Author: Oriol Alàs Cercós
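To make the pairing the post explores concrete, here is a small NumPy sketch of the three standard combinations: softmax with cross-entropy for multi-class classification, sigmoid with binary cross-entropy for binary/multi-label problems, and an identity (linear) output with MSE for regression. The logits and targets are made-up numbers:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Multi-class: softmax final layer + cross-entropy loss (true class = 0)
logits = np.array([2.0, 0.5, -1.0])
p = softmax(logits)
ce = -np.log(p[0])

# Binary / multi-label: sigmoid final layer + binary cross-entropy (true label = 1)
q = sigmoid(1.2)
bce = -np.log(q)

# Regression: identity final layer + mean squared error
pred, target = 3.1, 3.0
mse = (pred - target) ** 2
```

Each activation matches the range the loss expects: softmax yields a distribution over classes, sigmoid an independent probability per label, and the identity an unbounded real value.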

Variational AutoEncoders (VAE) for Tabular Data

Today's post is going to be a bit different. We have already talked about Variational Autoencoders (VAEs) in the past, but today we are going to implement one from scratch, train it on a dataset, and see how it behaves with tabular data. Yes, VAEs can be used for tabular data as well. To do so, we will use the CRISP-DM framework to guide us through the process. ...

Date: December 21, 2025 · Estimated Reading Time: 20 min · Author: Oriol Alàs Cercós
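As a taste of the machinery the post implements from scratch, here is a minimal NumPy sketch of two ingredients every VAE training loop needs: the reparameterization trick and the closed-form KL term against a standard normal prior. Shapes and values are illustrative only:

```python
import numpy as np

def kl_diag_gaussian(mu, logvar):
    # KL( N(mu, sigma^2) || N(0, I) ) per sample, for a diagonal Gaussian posterior.
    return -0.5 * np.sum(1 + logvar - mu**2 - np.exp(logvar), axis=-1)

def reparameterize(mu, logvar, rng):
    # z = mu + sigma * eps keeps sampling differentiable w.r.t. mu and logvar.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

rng = np.random.default_rng(0)
mu, logvar = np.zeros((4, 2)), np.zeros((4, 2))   # posterior equal to the prior
z = reparameterize(mu, logvar, rng)
kl = kl_diag_gaussian(mu, logvar)                  # zero when q(z|x) matches the prior
```

The full training loss would add a reconstruction term (e.g. MSE or cross-entropy over the decoded features) to this KL penalty.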

From Words to Vectors: A Dive into Embedding Model Taxonomy

Embedding models are foundational in modern NLP, turning raw text into numerical vectors that preserve semantic significance. These representations power everything from semantic search to Retrieval-Augmented Generation or Prompt Engineering for LLM Agents. With growing demand for domain-specific applications, understanding which model is the best fit for your system is more important than ever. Introduction: In modern NLP, a text embedding is a vector that represents a piece of text in a mathematical space. The magic of embeddings is that they encode semantic meaning: texts with similar meaning end up with vectors that are close together. For example, an embedding model might place "How to change a tire" near "Steps to fix a flat tire" in its vector space, even though the wording is different. This property makes embedding models incredibly useful for tasks like search, clustering, or recommendation, where we care about semantic similarity rather than exact keyword matches. By converting text into vectors, embedding models allow computers to measure meaning and relevance via distances in vector space. ...

Date: October 25, 2025 · Estimated Reading Time: 18 min · Author: Oriol Alàs Cercós
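The "close together in vector space" idea boils down to cosine similarity. The 4-dimensional vectors below are made-up stand-ins for real model embeddings (real ones typically have hundreds of dimensions), just to show the comparison:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1 = same direction, 0 = orthogonal.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical embeddings for three sentences (illustrative values only)
flat_tire = np.array([0.9, 0.1, 0.0, 0.2])    # "Steps to fix a flat tire"
change_tire = np.array([0.8, 0.2, 0.1, 0.3])  # "How to change a tire"
weather = np.array([0.0, 0.9, 0.8, 0.1])      # "Tomorrow's weather forecast"

sim_related = cosine_similarity(flat_tire, change_tire)
sim_unrelated = cosine_similarity(flat_tire, weather)
```

A retrieval system would rank "How to change a tire" above the weather sentence for a flat-tire query, because its vector points in a much closer direction.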

The Generative Trilemma: A quick overview

Generative models are a class of machine learning models that learn a representation of the data they are trained on and model the data itself. Ideally, generative models should satisfy the following key requirements in a real environment: high-quality samples, i.e., samples that capture the underlying patterns and structures present in the data, making them indistinguishable to human observers; fast sampling, which concerns the efficiency of generation and the computational overhead generative models can incur; and mode coverage/diversity, which reflects how well the model generates the full range of modes and diverse patterns present in the training data. Fig. 1. The Generative Learning Trilemma ...

Date: July 10, 2025 · Estimated Reading Time: 15 min · Author: Oriol Alàs Cercós

Introduction to Attention Mechanism and Transformers

Transformers have demonstrated excellent capabilities and have overcome challenges in NLP, text-to-image generation, and image completion given large datasets, large model sizes, and enough compute. Talking about transformers nowadays is as casual as talking about CNNs, MLPs, or linear regression. Why not take a glance at this state-of-the-art architecture? In this post, we'll introduce the Sequence-to-Sequence (Seq2Seq) paradigm, explore the attention mechanism, and provide a detailed, step-by-step explanation of the components that make up transformer architectures. ...

Date: February 17, 2025 · Estimated Reading Time: 10 min · Author: Oriol Alàs Cercós
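As a preview of the attention mechanism the post explains step by step, here is a minimal NumPy sketch of (unmasked, single-head) scaled dot-product attention; the toy shapes are arbitrary:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # similarity of each query to each key
    weights = softmax(scores)         # each row is a distribution over keys
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))   # 3 queries, dimension 4
K = rng.standard_normal((5, 4))   # 5 keys
V = rng.standard_normal((5, 4))   # 5 values
out, weights = scaled_dot_product_attention(Q, K, V)
```

Each output row is a weighted average of the value vectors, with weights given by how well the query matches each key; the 1/sqrt(d_k) scaling keeps the softmax from saturating for large dimensions.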

Thresholding, filtering and morphological operations

Traditional computer vision techniques involve methods and algorithms that do not rely on deep learning or neural networks. Instead of being data-driven, these approaches use classical operations to process and analyze images. So, in this post, we'll explore these three techniques! Thresholding: When the task is to distinguish the background from the foreground, thresholding provides a straightforward solution. We will use this image as an example. ...

Date: October 25, 2024 · Estimated Reading Time: 3 min · Author: Oriol Alàs Cercós
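Global thresholding, the first technique the post covers, is a one-liner on an image array. The 3×3 "image" and the threshold of 128 below are made up for illustration:

```python
import numpy as np

# Tiny grayscale "image": low values are background, high values are foreground
img = np.array([[ 10,  20, 200],
                [ 30, 220, 210],
                [ 15,  25, 240]], dtype=np.uint8)

mask = img > 128                  # boolean mask: True where the pixel is "bright"
foreground_pixels = int(mask.sum())
```

Filtering (e.g. a blur before thresholding) and morphological operations (e.g. opening to remove small speckles from the mask) would typically follow this step in a classical pipeline.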