2nd International Workshop on
Challenges and Experiences
from Data Integration to Knowledge Graphs

August 31 - September 4, 2020

Tokyo, Japan

Held in conjunction with VLDB 2020


DI2KG Benchmark datasets

DI2KG provides datasets of product specifications for the CAMERA and MONITOR categories, extracted from multiple eCommerce websites.

Each specification file consists of a series of key-value pairs (attributes), for example:

"<page title>": "Samsung Smart WB50F Digital Camera White Price in India with Offers & Full Specifications | PriceDekho.com",
"additional features": "Color\nWhite",
"brand": "Samsung",
"connectivity system req": "USB\nUSB 2.0",
"dimension": "Dimensions\n101 x 68 x 27.1 mm\nWeight\n157 gms",
"display": "Display Type\nLCD\nScreen Size\n3 Inches",
"general features": "Brand\nSamsung\nAnnounced\n2014, February\nStatus\nAvailable",
"lens": "Auto Focus\nCenter AF, Face Detection, Multi AF\nFocal Length\n4.3 - 51.6 mm (35 mm Equivalent to 24 - 288 mm)",
"media software": "Memory Card Type\nSD, SDHC, SDXC",
"optical sensor resolution in megapixel": "16.2 MP",
"other features": "ISO Rating\nAuto / 80 / 100 / 200 / 400 / 800 / 1600 / 3200\nSelf Timer\n2 sec, 10 sec\nFace Detection\nYes\nImage Stabilizer\nOptical\nMetering\nCenter, Multi, Spot\nExposure Compensation\n1/3 EV Steps, +/-2.0 EV\nMacro Mode (Exposure Mode)\n5 - 80 cm (W)\nRed Eye Reduction\nYes\nWhite Balancing\nAuto\nMicrophone\nBuilt-In Monaural Microphone",
"pixels": "Optical Sensor Resolution (in MegaPixel)\n16.2 MP",
"sensor": "Sensor Type\nCCD Sensor\nSensor Size\n1/2.3 Inches",
"sensor type": "CCD Sensor",
"shutter speed": "Maximum Shutter Speed\n1/2000 sec\nMinimum Shutter Speed\n2 sec",
"zoom": "Optical Zoom\n12x\nDigital Zoom\n2x"

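As an illustration of how such a file might be read, here is a minimal Python sketch. It assumes each specification is stored as a single JSON object, and that a value containing newlines alternates between sub-attribute labels and their values, as in the "zoom" and "display" entries above. Both assumptions are inferred from this one example and are not a documented format guarantee. The function names `split_nested` and `parse_specification` are hypothetical.

```python
import json


def split_nested(value):
    """Split 'Label\\nValue\\nLabel\\nValue' into a dict.

    Returns None when the text does not split into an even number of
    parts, i.e. when it does not look like label/value pairs.
    """
    parts = value.split("\n")
    if len(parts) < 2 or len(parts) % 2 != 0:
        return None
    # Even positions are assumed to be labels, odd positions their values.
    return dict(zip(parts[0::2], parts[1::2]))


def parse_specification(text):
    """Parse one specification (a JSON object) into a Python dict.

    Values that look like label/value sequences become nested dicts;
    all other values are kept unchanged.
    """
    record = json.loads(text)
    parsed = {}
    for key, value in record.items():
        nested = split_nested(value) if isinstance(value, str) else None
        parsed[key] = nested if nested is not None else value
    return parsed


# A shortened specification using three attributes from the example above.
sample = (
    '{"brand": "Samsung", '
    '"zoom": "Optical Zoom\\n12x\\nDigital Zoom\\n2x", '
    '"sensor type": "CCD Sensor"}'
)
spec = parse_specification(sample)
print(spec["zoom"])
```

Note that the even/odd heuristic is deliberately simple: a free-text value that happens to contain an odd number of line breaks would be misread as pairs, and repeated labels within one value would overwrite each other, so real files may need source-specific handling.
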
Labelled data for the CAMERA dataset are available for the data integration tasks of the 2019 challenge:

  • Entity Resolution
  • Schema Alignment
  • Knowledge Graph Augmentation

The dataset format is defined in the challenge section of the 2019 challenge website. Download labelled data v3.0

A full ground truth will be available at the end of this year's challenge.

Note that the labelled data and tasks for this year's challenge are yet to be defined.

