R Find the “groups” of tuples [duplicate]












This question already has an answer here:




  • Make a group_indices based on several columns

    1 answer




I am trying to find the "group" (id3) that each row belongs to, based on two variables (id1, id2):



df = data.frame(id1 = c(1,1,2,2,3,3,4,4,5,5),
                id2 = c('a','b','a','c','c','d','x','y','y','z'),
                id3 = c(rep('group1',6), rep('group2',4)))


   id1 id2    id3
1    1   a group1
2    1   b group1
3    2   a group1
4    2   c group1
5    3   c group1
6    3   d group1
7    4   x group2
8    4   y group2
9    5   y group2
10   5   z group2


For example, id1=1 is related to a and b of id2. But id1=2 is also related to a, so both belong to one group (id3=group1). Since id1=2 and id1=3 share id2=c, id1=3 also belongs to that group (id3=group1). The values of the tuple ((1,2,3),('a','b','c','d')) appear nowhere else, so no other row belongs to that group (which is generically labeled group1).



My idea was to create a table based on id3, which would subsequently be populated in a loop.



solution = data.frame(id3 = c('group1', 'group2'), id1 = NA, id2 = NA)
group = 1

# Pseudocode: the "populate" steps and the "fully populated" test are the missing parts
for (step in 1:1000) {  # run many steps to make sure to get all values
  # solution$id1[group] = ...  (populate)
  # solution$id2[group] = ...  (populate)

  # if (fully populated) {
  #   group = group + 1
  # }
}


I am struggling to see how to populate.





Disclaimer: I asked a similar question here, but using names in id2 led a lot of people to point me to fuzzy string procedures in R, which are not needed here, since an exact solution exists. I have also included in this post all the code I have tried since then.










Marked as duplicate by Scarabee and Jaap, 9 hours ago.

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.























asked 16 hours ago by SAFEX, edited 10 hours ago by les






























1 Answer














You can use the igraph package: treat every (id1, id2) pair as an edge of a graph, and the groups are then the connected components of that graph.



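library(igraph)
# Build an undirected graph whose edges are the (id1, id2) pairs
g <- graph_from_data_frame(df[, c("id1", "id2")], directed = FALSE)
# Named vector: vertex name -> component number
cg <- components(g)$membership
# Look up by vertex name rather than by position, so id1 need not be 1, 2, 3, ...
df$id3 <- unname(cg[as.character(df$id1)])
df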


output:

   id1 id2 id3
1    1   a   1
2    1   b   1
3    2   a   1
4    2   c   1
5    3   c   1
6    3   d   1
7    4   x   2
8    4   y   2
9    5   y   2
10   5   z   2
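
If adding igraph as a dependency is not an option, the same connected-components idea can be sketched in base R, which is also close to the "populate in a loop" idea from the question. This is only a minimal sketch: `find_groups` is a hypothetical helper name, not part of any package. Every row starts with its own label, and labels are repeatedly merged across rows that share an id1 or an id2 value until nothing changes.

```r
# Hypothetical helper: returns a group number for every (id1, id2) row.
find_groups <- function(id1, id2) {
  # Work on character copies so unused factor levels cannot create empty groups
  id1 <- as.character(id1)
  id2 <- as.character(id2)
  label <- seq_along(id1)                  # start: every row is its own group
  repeat {
    previous <- label
    label <- ave(label, id1, FUN = min)    # rows sharing id1 take the smallest label
    label <- ave(label, id2, FUN = min)    # rows sharing id2 take the smallest label
    if (identical(label, previous)) break  # no label changed: converged
  }
  match(label, unique(label))              # renumber as 1, 2, ... by first appearance
}

df <- data.frame(id1 = c(1,1,2,2,3,3,4,4,5,5),
                 id2 = c('a','b','a','c','c','d','x','y','y','z'))
df$id3 <- paste0("group", find_groups(df$id1, df$id2))
df
```

The loop stops on its own because labels can only decrease, so there is no need to guess a fixed number of steps such as 1000. For long chains of linked rows it may need many passes, in which case the igraph version is likely the better choice.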





answered 16 hours ago by chinsoon12
















