R Find the “groups” of tuples [duplicate]
This question already has an answer here:
Make a group_indices based on several columns
1 answer
I am trying to find the "group" (id3) based on two variables (id1, id2):
df = data.frame(id1 = c(1,1,2,2,3,3,4,4,5,5),
id2 = c('a','b','a','c','c','d','x','y','y','z'),
id3 = c(rep('group1',6), rep('group2',4)))
id1 id2 id3
1 1 a group1
2 1 b group1
3 2 a group1
4 2 c group1
5 3 c group1
6 3 d group1
7 4 x group2
8 4 y group2
9 5 y group2
10 5 z group2
For example, id1=1 is related to a and b of id2. But id1=2 is also related to a, so both belong to one group (id3=group1). Since id1=2 and id1=3 share id2=c, id1=3 also belongs to that group (id3=group1). The values of the tuple ((1,2,3),('a','b','c','d')) appear nowhere else, so no other row belongs to that group (which is labeled group1 generically).
My idea was to create a table based on id3, which would subsequently be populated in a loop.
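solution = data.frame(id3 = c('group1', 'group2'), id1 = NA, id2 = NA)
group = 1
for (step in 1:1000) {  # run many steps to make sure to get all values
  # solution$id1[group] = ...  # populate (the part I cannot work out)
  # solution$id2[group] = ...  # populate (the part I cannot work out)
  fully_populated = FALSE      # placeholder: TRUE once the group stops growing
  if (fully_populated) {
    group = group + 1
  }
}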
I am struggling to see how to populate.
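The loop idea above can be sketched in base R as a fixed-point iteration. This is only a minimal sketch under my own assumptions, not the accepted approach: every row starts with its own label, and each pass gives a row the smallest label found among the rows that share its id1 or its id2. The names label, by_id1 and by_id2 are hypothetical helpers introduced for illustration.

```r
df <- data.frame(id1 = c(1,1,2,2,3,3,4,4,5,5),
                 id2 = c('a','b','a','c','c','d','x','y','y','z'))

# one provisional label per row
label <- seq_len(nrow(df))

repeat {
  # smallest label among rows with the same id1 ...
  by_id1 <- ave(label, df$id1, FUN = min)
  # ... then smallest label among rows with the same id2
  by_id2 <- ave(by_id1, df$id2, FUN = min)
  # stop once a full pass changes nothing
  if (all(by_id2 == label)) break
  label <- by_id2
}

# renumber the surviving labels as group1, group2, ...
df$id3 <- paste0("group", match(label, unique(label)))
df
```

Labels can only decrease and never cross between unrelated rows, so the loop stops without needing a fixed number of steps such as 1000.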
Disclaimer: I asked a similar question here, but using names in id2 led a lot of people to point me to fuzzy string procedures in R, which are not needed here, since there exists an exact solution. I have also included all the code I have tried since then in this post.
r
marked as duplicate by Scarabee, Jaap
9 hours ago
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
edited 10 hours ago by les
asked 16 hours ago by SAFEX
1 Answer
You can use igraph to find the connected components (clusters) of the network:
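library(igraph)
# undirected graph whose edges are the (id1, id2) pairs
g <- graph_from_data_frame(df[, c("id1", "id2")], directed = FALSE)
# named vector: vertex name -> component number
# (components() is the newer name for clusters() in igraph)
cg <- components(g)$membership
# look up by vertex name rather than by position
df$id3 <- unname(cg[as.character(df$id1)])
df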
output:
id1 id2 id3
1 1 a 1
2 1 b 1
3 2 a 1
4 2 c 1
5 3 c 1
6 3 d 1
7 4 x 2
8 4 y 2
9 5 y 2
10 5 z 2
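One caveat, assuming id1 and id2 could ever share a value (they do not in the sample data): graph_from_data_frame treats equal names as the same vertex, so an id1 of 1 and an id2 of "1" would be merged. A hedged variant that keeps the two columns apart prefixes each value first. The prefixes "id1_" and "id2_" and the names from, to, edges and memb are my own choices for illustration.

```r
library(igraph)

df <- data.frame(id1 = c(1,1,2,2,3,3,4,4,5,5),
                 id2 = c('a','b','a','c','c','d','x','y','y','z'))

# prefix the values so the two columns can never collide as vertex names
from <- paste0("id1_", df$id1)
to   <- paste0("id2_", df$id2)
edges <- data.frame(from = from, to = to, stringsAsFactors = FALSE)

g <- graph_from_data_frame(edges, directed = FALSE)

# membership is named by vertex, so index it with the prefixed id1 names
memb <- components(g)$membership
df$id3 <- unname(memb[from])
df
```

Looking the membership up by name also avoids relying on the vertex order that igraph happens to produce.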
answered 16 hours ago by chinsoon12