Adding rows from one IEnumerable to another based on conditions: c#, linq, c#-4.0

I have two arrays:

var data1 = new[] {
    new { Product = "Product 1", Year = 2009, Sales = 1212 },
    new { Product = "Product 2", Year = 2009, Sales = 522 },
    new { Product = "Product 1", Year = 2010, Sales = 1337 },
    new { Product = "Product 2", Year = 2011, Sales = 711 },
    new { Product = "Product 2", Year = 2012, Sales = 2245 },
    new { Product = "Product 3", Year = 2012, Sales = 1000 }
};

var data2 = new[] {
    new { Product = "Product 1", Year = 2009, Sales = 1212 },
    new { Product = "Product 1", Year = 2010, Sales = 1337 },
    new { Product = "Product 2", Year = 2011, Sales = 711 },
    new { Product = "Product 2", Year = 2012, Sales = 2245 }
};

What I want to do is take every distinct Product and every distinct Year in data2 and, if a row for any combination of such a Product and Year exists in data1 but not in data2, add that row to data2.

Example: in data2, the distinct products are Product 1 and Product 2, and the distinct years are 2009, 2010, 2011 and 2012.

data1 contains the row { Product = "Product 2", Year = 2009, Sales = 522 }, which is not present in data2, so I want to add it to data2.

What I can do is get the distinct products and the distinct years into two variables.

Then run nested foreach loops over both, check whether each combination exists in data1 but not in data2 and, if so, add it to data2.
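
For reference, the loop-based approach described above might look roughly like the sketch below. The Row class, the LoopBaseline class and the AddMissingRows method are hypothetical names introduced only for illustration (the question uses anonymous types, which cannot be named in a method signature), and the method returns a new list rather than modifying data2, since arrays cannot grow.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LoopBaseline
{
    // Hypothetical row type standing in for the anonymous type in the question.
    public sealed class Row
    {
        public string Product { get; set; }
        public int Year { get; set; }
        public int Sales { get; set; }
    }

    // Returns the rows of data2 followed by every data1 row whose Product and
    // Year both occur somewhere in data2, but never together in a data2 row.
    public static List<Row> AddMissingRows(IEnumerable<Row> data1, IEnumerable<Row> data2)
    {
        var result = data2.ToList();

        // Materialize the distinct values up front so that the loops
        // do not observe the rows appended to result below.
        var products = result.Select(r => r.Product).Distinct().ToList();
        var years = result.Select(r => r.Year).Distinct().ToList();

        foreach (var product in products)
        {
            foreach (var year in years)
            {
                // Skip combinations that data2 already contains.
                if (result.Any(r => r.Product == product && r.Year == year))
                    continue;

                // Copy over every data1 row for this missing combination.
                result.AddRange(data1.Where(r => r.Product == product && r.Year == year));
            }
        }

        return result;
    }
}
```

With the sample data from the question, this should return the four data2 rows plus the "Product 2" / 2009 row; the "Product 3" row is left out because Product 3 never appears in data2.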

What I would like is a single LINQ query that does this job for me, instead of computing the two distinct sets separately and then running a couple of foreach loops.

Thanks

Answers:

Answer № 1 (score: 2)

You can make it work in a single query. However, it will not be optimal, because for each element of data1 you would need to check three conditions, each of which potentially requires scanning all of data2, giving O(m*n) time complexity (space complexity remains O(1), though).

You can still avoid the explicit loops, though:

var uniqueProd = new HashSet<string>(data2.Select(d => d.Product));
var uniqueYear = new HashSet<int>(data2.Select(d => d.Year));
var knownPairs = new HashSet<Tuple<string,int>>(
    data2.Select(d => Tuple.Create(d.Product, d.Year))
);
var newData2 = data2.Concat(
    data1.Where(d =>
        uniqueProd.Contains(d.Product)                       // The product is there
    &&  uniqueYear.Contains(d.Year)                          // The year is there
    && !knownPairs.Contains(Tuple.Create(d.Product, d.Year)) // Combination is not there
    )
).ToArray();

This solution is O(m+n) in time and O(n) in space.

Answer № 2 (score: 1)

I make no claims about efficiency, but it is possible in a single query.

If you are happy to let Union handle removing the duplicates, you can do:

var newd2 = data2.Union(
    from d1 in data1
    where
        (from d2p in data2 from d2y in data2
         select new { d2p.Product, d2y.Year })
        .Distinct().Any(mp => mp.Product == d1.Product && mp.Year == d1.Year)
    select d1);

Alternatively, you can exclude the rows that already have a match in data2 and use Concat:

var newd2 = data2.Concat(
    from d1 in data1
    where
        (from d2p in data2 from d2y in data2 select new { d2p.Product, d2y.Year })
        .Distinct().Any(mp => mp.Product == d1.Product && mp.Year == d1.Year) &&
        !data2.Any(mp => mp.Product == d1.Product && mp.Year == d1.Year)
    select d1
);

OTOH, I could not resist running some timings. Taking the Union version as the baseline (1), the Concat version takes about 73% of that time, building the HashSets takes 827%, pulling the unique-pair extraction out of the query takes 54%, and skipping the .Distinct() takes 27%, although the data set is too small to reliably distinguish some of these.

Pulling the pairs out and dropping Distinct:

var newdd = (from d2p in data2 from d2y in data2 select new { d2p.Product, d2y.Year });
var newd2 = data2.Concat(
    from d1 in data1
    where
        newdd.Any(mp => mp.Product == d1.Product && mp.Year == d1.Year) &&
        !data2.Any(mp => mp.Product == d1.Product && mp.Year == d1.Year)
    select d1
);