ResearchTrend.AI
Encoding high-cardinality string categorical variables
arXiv:1907.01860 (v5 is the latest version)

3 July 2019
Patricio Cerda
Gaël Varoquaux

Papers citing "Encoding high-cardinality string categorical variables"

18 / 18 papers shown
TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations
Alan Arazi
Eilam Shapira
Roi Reichart
LMTD
205
0
0
23 May 2025
CLAMS: A System for Zero-Shot Model Selection for Clustering
Prabhant Singh
Pieter Gijsbers
Murat Onur Yildirim
Elif Ceren Gok
Joaquin Vanschoren
82
0
0
15 Jul 2024
CAVIAR: Categorical-Variable Embeddings for Accurate and Robust Inference
Anirban Mukherjee
H. Chang
61
0
0
07 Apr 2024
Automated data processing and feature engineering for deep learning and big data applications: a survey
A. Mumuni
F. Mumuni
TPM
84
60
0
18 Mar 2024
CARTE: Pretraining and Transfer for Tabular Learning
Myung Jun Kim
Léo Grinsztajn
Gaël Varoquaux
LMTD
147
23
0
26 Feb 2024
Comparative Study on the Performance of Categorical Variable Encoders in Classification and Regression Tasks
Wenbin Zhu
Runwen Qiu
Ying Fu
25
4
0
18 Jan 2024
Encoding categorical data: Is there yet anything 'hotter' than one-hot encoding?
Ekaterina Poslavskaya
Alexey Korolev
56
7
0
28 Dec 2023
Vectorizing string entries for data processing on tables: when are larger language models better?
Léo Grinsztajn
Edouard Oyallon
Myung Jun Kim
Gaël Varoquaux
71
3
0
15 Dec 2023
Predicting delays in Indian lower courts using AutoML and Decision Forests
M. Bhatnagar
Shivraj Huchhanavar
83
1
0
30 Jul 2023
A benchmark of categorical encoders for binary classification
Federico Matteucci
Vadim Arzamasov
Klemens Boehm
ELM
59
5
0
17 Jul 2023
Saibot: A Differentially Private Data Search Platform
Zezhou Huang
Jiaxiang Liu
Daniel Alabi
Raul Castro Fernandez
Eugene Wu
64
7
0
01 Jul 2023
Categorising Products in an Online Marketplace: An Ensemble Approach
Kieron Drumm
26
0
0
26 Apr 2023
Progressive Feature Upgrade in Semi-supervised Learning on Tabular Domain
Morteza Mohammady Gharasuie
Fenjiao Wang
70
0
0
01 Dec 2022
Predicting Treatment Adherence of Tuberculosis Patients at Scale
Mihir Kulkarni
Satvik Golechha
Rishi Raj
J. Sreedharan
Ankit Bhardwaj
...
Jayakrishna Kurada
S. Mattoo
R. Joshi
K. Rade
Alpa Raval
65
3
0
05 Nov 2022
URANUS: Radio Frequency Tracking, Classification and Identification of Unmanned Aircraft Vehicles
Domenico Lofú
Pietro Di Gennaro
Pietro Tedeschi
Tommaso Di Noia
E. Sciascio
72
14
0
13 Jul 2022
Fairness Implications of Encoding Protected Categorical Attributes
Carlos Mougan
J. Álvarez
Salvatore Ruggieri
Steffen Staab
FaML
67
16
0
27 Jan 2022
From Strings to Data Science: a Practical Framework for Automated String Handling
John W. van Lith
Joaquin Vanschoren
17
1
0
02 Nov 2021
Regularized target encoding outperforms traditional methods in supervised machine learning with high cardinality features
F. Pargent
Florian Pfisterer
Janek Thomas
B. Bischl
49
88
0
01 Apr 2021