SparkER: Scaling Entity Resolution in Spark

Conference proceedings paper
Publication date:
2019
Abstract:
We present SparkER, an ER tool that can scale practitioners' favorite ER algorithms. SparkER has been devised to take full advantage of parallel and distributed computation (it runs on top of Apache Spark). The first SparkER version focused on the blocking step and implemented both schema-agnostic and Blast meta-blocking approaches (i.e., the state-of-the-art ones); a GUI for SparkER was also developed to let non-expert users run it in an unsupervised mode. The new version of SparkER, to be shown in this demo, significantly extends the tool. Entity Matching and Entity Clustering modules have been added. Moreover, in addition to the completely unsupervised mode of the first version, a supervised mode has been added. The user can be assisted in supervising the entire process and in injecting their knowledge in order to achieve the best result. During the demonstration, attendees will be shown how SparkER can significantly help in devising and debugging ER algorithms.
CRIS type:
4.1 Conference proceedings paper
Authors:
Gagliardelli, Luca; Simonini, Giovanni; Beneventano, Domenico; Bergamaschi, Sonia
University-affiliated authors:
GAGLIARDELLI LUCA
Link to the full record:
https://iris.uniecampus.it/handle/11389/69818
Book title:
Advances in Database Technology - EDBT 2019, 22nd International Conference on Extending Database Technology, Lisbon, Portugal, March 26-29, Proceedings